








THE ANNALS 


of 

MATHEMATICAL 

STATISTICS 


(Printed in U. S. A.) 


Copyright IsFf* 





Published and Lithoprinted by

EDWARDS BROTHERS, INC. 

ANN ARBOR. MICH. 





EDITORIAL COMMITTEE 


H. C. Carver, Editor

J. W. Edwards, Business Manager 


A quarterly publication of the American Statistical Association, 
devoted to the theory and application of Mathematical Statistics. 


Six dollars per annum.


Reprints of any article in this volume may be obtained at any time
from the Editor at the following rates, postage included.


Number of Copies          Cost per Page

1-4 . . . . . . . . . .   2 cents
5-24  . . . . . . . . .   1½ cents
25-49 . . . . . . . . .   1 cent
50 and over . . . . . .   [illegible] cent


Address: Editor, Annals of Mathematical Statistics 
Post Office Box 171, Ann Arbor, Michigan 




CONTENTS OF VOLUME II


The Relation between Stability and Homogeneity . . . 1
    By L. V. Bortkiewicz

Bayes' Theorem . . . 23
    By E. C. Molina

On Certain Properties of Frequency Distributions Obtained
by a Linear Fractional Transformation of the Variates
of a Given Distribution . . . 38
    By H. L. Rietz

On Small Samples from Certain Non-Normal Universes . . . 48
    By Paul R. Rider

An Empirical Determination of the Distribution of Means,
Standard Deviations, and Correlation Coefficients
Drawn from Rectangular Populations . . . 66
    By Hilda Frost Dunlap

The Interdependence of Sampling and Frequency Distribution
Theory . . . 82
    Editorial

Note on the Distribution of Means of Samples of N Drawn
from a Type A Population . . . 99
    By Cecil C. Craig

On Symmetric Functions and Symmetric Functions of
Symmetric Functions . . . 102
    By A. O'Toole

Fundamental Formulas for the Doolittle Method, Using
Zero-order Correlation Coefficients . . . 150
    By Harold D. Griffin

On a Property of the Semi-invariants of Thiele . . . 154
    By Cecil C. Craig

The Theory of Observations . . . 165
    By T. N. Thiele

Correction for the Moments of a Frequency Distribution
in Two Variables . . . 309
    By William Dowell Baten

The Standard Error of a Multiple Regression Equation . . . 320
    By John Rice Miner

Sampling in the Case of Correlated Observations . . . 324
    By Cecil C. Craig

The Relation between the Means and Variances, Means
Squared and Variances in Samples from Combinations
of Normal Populations . . . 333
    By G. A. Baker

A Table to Facilitate the Fitting of Certain Logistic Curves . . . 355
    By Joshua L. Bailey, Jr.

The Generalization of Student's Ratio . . . 360
    By Harold Hotelling

Systems of Polynomials Connected with the Charlier
Expansions and the Pearson Differential and Difference
Equations . . . 379
    By Emanuel Henry Hildebrandt

A New Formula for Predicting the Shrinkage of the
Coefficient of Multiple Correlation . . . 440
    By R. J. Wherry

The Use of the Relative Residual in the Application of the
Method of Least Squares . . . 458
    By Walter A. Hendricks










THE RELATIONS BETWEEN STABILITY 


AND HOMOGENEITY* 


By 

L. V. Bortkiewicz 


The idea of investigating the stability of statistical frequencies
from the standpoint of the theory of probability goes back to the
French mathematician Bienaymé. From various examples taken from
social and moral statistics, he was the first to establish the fact
that, almost without exception, the stability in question was
essentially less than the “classical norm,” that is, less than the
expectation which is associated with the classical scheme of
independent trials with a constant underlying probability. In order
to explain this discrepancy between theory and observation, Bienaymé
used a modification of the traditional procedure which was
characterized by the assumption that between neighboring trials in a
time ordered sequence a sort of dependence existed. This method is
interesting in itself and was, among other things, adopted by Cournot
as his own; nevertheless we shall replace it in what follows by
another, originating from Lexis, which has the advantage of a wider
usefulness, in that it can be applied not only to undulatory but to
evolutory sequences.¹

*Translated by A. R. Crathorne. Read before the American Statistical
Association ...

Let us assume that for a series of $z$ successive time intervals, say
years, we have found that some event (accident, death, marriage,
crime) has happened $x_1, x_2, \ldots, x_z$ times, and that the
corresponding numbers of “trials,” that is, the numbers of persons
observed, are $s_1, s_2, \ldots, s_z$, so that the quotients
$y_1 = x_1/s_1$, $y_2 = x_2/s_2$, $\ldots$, $y_z = x_z/s_z$ represent
a time ordered sequence of relative frequencies. Instead of assuming,
as the traditional theory demands, that each term of this series
corresponded to a common fundamental probability $p$, weighted with
accidental errors, Lexis assumed that each value $y_k$ was associated
with a distinct probability $p_k$.

As a result of this, the expected amplitude of the fluctuations of the
values $y_k$ increased, and the greater the variations in the $p_k$'s
the greater the amplitude. Under the simplifying hypothesis
$s_k$ = const. (= $s$), the corresponding standard deviation $\sigma$
is defined by

$$\sigma^2 = \frac{1}{z-1}\sum_{k=1}^{z}(y_k - y)^2, \qquad y = \frac{1}{z}\sum_{k=1}^{z} y_k .$$

For the case of a constant $p$ we may write

(1)    $$E(\sigma^2) = \frac{p(1-p)}{s},$$

where $E$ denotes “expectation.” In the Lexis procedure with a
variable $p_k$, using the notation

$$p = \frac{1}{z}\sum_{k=1}^{z} p_k, \qquad u^2 = \frac{p(1-p)}{s}, \qquad \omega^2 = \frac{1}{z-1}\sum_{k=1}^{z}(p_k - p)^2,$$

the corresponding relation

(2)    $$E(\sigma^2) = u^2 + \left(1 - \frac{z-1}{zs}\right)\omega^2$$

can be derived.²

In the following numerical examples the numbers of observations are
never less than some ten thousands, while $z = 10$. Hence, as far as
these and similar examples are concerned, the numerical results are
not appreciably altered if, instead of (2), we use

(3)    $$E(\sigma^2) = u^2 + \omega^2 .$$

¹Bienaymé, in the journal “L'Institut,” Vol. 7 (1831), pages 187-189,
and in “Journal de la Société de Statistique de Paris,” 17e (1876),
pages 199-204. A. Cournot, Exposition de la théorie des chances et des
probabilités, Paris, 1843, Nos. 79 and 117. W. Lexis, “Über die
Theorie der Stabilität statistischer Reihen,” in the Jahrbuch für
Nationalökonomie und Statistik, Vol. 32 (1879), pages 60 ..,
reprinted in Abhandlungen zur Theorie der Bevölkerungs- und
Moralstatistik, Jena, 1903, pages 170-212.

However, a certain inaccuracy arises if, in the application of
formula (3) to the raw data, one has disregarded the fundamental
assumption that $s_k$ is constant and in the expression for $u^2$ has
replaced $s$ by the arithmetic mean of the values $s_k$. If, however,
the latter differ little from one another, such a procedure gives rise
to no great discrepancy. Lexis called the quantities $u$ and $\omega$
in formula (3) the two “fluctuation components,” which combine
(according to the law of composition of forces) to give the expected
total fluctuation. The quantity $u$ gives expression to the effect of
the “accidental causes” in the sense of the theory of probability, and
this effect grows less and less with increasing $s$ until it vanishes
for $s = \infty$. For this reason Lexis called $u$ the normal
component. He also used the term “unessential fluctuation component.”
On the other hand, $\omega$ depends on the variations of the
fundamental probability, that is, on the underlying general
conditions, and in this sense was designated by Lexis as the physical
component. We may also call it the essential component.

²One does not find formula (2) in Lexis's work. He was satisfied at
this point with a rather inexact method yielding an approximate
result. However, this did not affect the essential part of his
discussion.

The first of the two components $u$ and $\omega$ can be easily
calculated directly with sufficient approximation. The usual method is
to substitute for the unknown $p$ in the expression for $u^2$ the
value $y$, the arithmetic mean of the frequencies $y_k$, obtaining

(4)    $$u^2 = \frac{y(1-y)}{s} .$$

As for the second component $\omega$, it is calculated by the indirect
method of substituting $\sigma^2$ for $E(\sigma^2)$ in (3), and then
$\omega$ is found from $\omega^2 = \sigma^2 - u^2$. This method,
however, assumes that $\sigma > u$, or what is the same thing, that
the dispersion coefficient, $Q = \sigma/u$, is greater than 1. In his
older papers, Lexis distinguished between subnormal, normal and
supernormal dispersion, according to whether $Q$ was distinctly less
than 1, approximately equal to 1, or distinctly greater than 1, and
found that in social and moral statistics the subnormal dispersion
never occurred and the normal rarely. Supernormal dispersion was the
rule. So Lexis based his scheme of a varying underlying probability on
the case of supernormal dispersion. In fact, from formula (3), we have

(5)    $$E(Q^2) = 1 + \frac{\omega^2}{u^2},$$

which says that the variations in the underlying probability lead us
to expect values of $Q$ greater than unity.³

Notwithstanding the fact that $Q$ was usually greater than unity,
Lexis did not consider this a proof that his scheme adequately
described the actual facts. In addition to this he was more concerned
with the fact that in experience $Q$ showed a tendency to decrease
with decreasing number of “trials,” that is, with decreasing $s$.
Indeed, in a series of examples, Lexis had shown that a value of $Q$
which was decidedly greater than unity when calculated for an entire
country, decreased to nearly 1 when the data for the single
administration districts of the same country were used. Lexis
considered such behavior of $Q$ as entirely in harmony with his
scheme.

³Under the influence of accidental causes, $Q$ may be less than unity
not only for constant, but also for varying underlying probabilities,
and this circumstance must be considered in the determination of
$\omega$. It would carry us too far afield to go further into this
matter.
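The calculation of the two fluctuation components and of the
dispersion coefficient may be sketched as follows. The sketch assumes
a constant number of trials $s$, the divisor $z - 1$ in the empirical
variance, and the substitution of the mean frequency for $p$; the
yearly counts and the function name `lexis_dispersion` are
hypothetical and serve only as an illustration.

```python
import math

def lexis_dispersion(x, s):
    """Return (u, sigma, Q, omega) for yearly counts x and a constant number of trials s."""
    z = len(x)
    y = [xk / s for xk in x]                  # relative frequencies y_k
    y_mean = sum(y) / z                       # arithmetic mean of the y_k
    # empirical variance of the y_k (the divisor z - 1 is an assumption of this sketch)
    sigma2 = sum((yk - y_mean) ** 2 for yk in y) / (z - 1)
    u2 = y_mean * (1.0 - y_mean) / s          # normal component, with the mean put for p
    q = math.sqrt(sigma2 / u2)                # dispersion coefficient Q = sigma / u
    # essential component from omega^2 = sigma^2 - u^2; defined only if sigma > u
    omega = math.sqrt(sigma2 - u2) if sigma2 > u2 else float("nan")
    return math.sqrt(u2), math.sqrt(sigma2), q, omega

# Hypothetical data: ten yearly counts observed among s = 60,000,000 persons.
counts = [12600, 13700, 12900, 13300, 13500, 12700, 13600, 13000, 13200, 13100]
u, sigma, q, omega = lexis_dispersion(counts, 60_000_000)
print(f"u = {u:.3e}, sigma = {sigma:.3e}, Q = {q:.2f}, omega = {omega:.3e}")
```

With these invented counts the dispersion is supernormal, so that
$\omega$ is real; for a subnormal series the function returns no value
for $\omega$, in keeping with the remark that the indirect method
assumes $\sigma > u$.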

If we write formula (5) in the form

(6)    $$E(Q^2) = 1 + \frac{s\,\omega^2}{p(1-p)},$$

we see that the excess of $Q^2$ over and above 1 is in expectation
directly proportional to $s$. This was the explanation of the decrease
of $Q$ with decreasing $s$, for as Lexis said, we have no ground to
expect that $s$ being large or small had any bearing on the value of
$\omega$.

It is this last point about which the criticism of Lexis's dispersion
theory centers. Notwithstanding the endeavors of Lexis to fit his
theory to statistical reality, we can show that the facts were against
him as far as his assumption that $\omega$ is fundamentally
independent of $s$ is concerned. If this assumption were true, then
formula (6) tells us distinctly how $Q$ decreases with diminishing
$s$. We learn from experience that as a rule this decrease in $Q$ is
less than that given by the formula; from which it follows that the
essential component, $\omega$, has a tendency to increase with
decreasing $s$.

If we desire to investigate just what happens in reality, a certain
complication arises, because we are never able to compare groups which
differ among one another as to $s$, but not as to $p$. In order to
eliminate to some extent the variations of $p$ we consider the ratio
of $\omega$ to $p$. Let $\omega/p = \beta$, and call $\beta$ the
relative essential component to distinguish it from the absolute
essential component $\omega$. Formula (6) then becomes the following:

(7)    $$E(Q^2) = 1 + \frac{sp}{1-p}\,\beta^2 .$$

The product $sp$ can be considered as the expected number of
“successes.” For a constant $s_k$ (= $s$) we have

$$E(x_k) = s p_k, \qquad \frac{1}{z}\sum_{k=1}^{z} E(x_k) = sp,$$

and, letting $s = \frac{1}{z}\sum_{k=1}^{z} s_k$, the last relation is
true with sufficient approximation for a variable $s_k$ provided the
variation is not too pronounced. Let $sp = m$. Often, as in the
examples which follow, $p$ is so small that we can consider $(1-p)$ as
equal to 1. Formula (7) then becomes

(8)    $$E(Q^2) = 1 + m\beta^2 .$$
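How the expected dispersion falls off with a narrowing field of
observation, when $\beta$ is held fixed, can be seen from a short
numerical sketch of the relation $E(Q^2) = 1 + m\beta^2$. The value of
$\beta$ and the values of $m$ below are hypothetical, as is the
function name.

```python
def expected_q_squared(m, beta):
    """E(Q^2) = 1 + m * beta^2, the small-p form of the dispersion relation."""
    return 1.0 + m * beta ** 2

beta = 0.04  # hypothetical relative essential component, held fixed
for m in (10, 100, 1000, 10000):
    # the excess of Q^2 over 1 is proportional to the expected number of successes m
    print(m, round(expected_q_squared(m, beta), 3))
```

For small $m$ the expected value of $Q^2$ is scarcely distinguishable
from 1, which is the behavior Lexis regarded as confirming his scheme.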

The question as to whether there is a connection between $s$ and
$\omega$ is now changed to an investigation of the relationship
between $m$ and $\beta$. In undertaking such an investigation
empirically, we compare as to the behavior of $m$ and $\beta$ a
statistical aggregate considered as a total with its component parts
considered as partial aggregates. Let the number of the partial
aggregates be $n$, and let the corresponding values of $m$ and $p$ as
well as $u$, $\omega$ and $\sigma$ be indicated by the subscript $i$,
which can also serve as the ordinal number of the partial aggregate.
For the total aggregate, let $i = 0$. The symbols $s_{i,k}$,
$x_{i,k}$, $y_{i,k}$, $p_{i,k}$ are the $s$, $x$, $y$, $p$ of the
$i$th partial aggregate and the $k$th time interval. We also use the
notation

$$s_i = \frac{1}{z}\sum_{k=1}^{z} s_{i,k}, \qquad x_i = \frac{1}{z}\sum_{k=1}^{z} x_{i,k}, \qquad y_i = \frac{1}{z}\sum_{k=1}^{z} y_{i,k}, \qquad p_i = \frac{1}{z}\sum_{k=1}^{z} p_{i,k},$$

from which we have

$$s_0 = \sum_{i=1}^{n} s_i, \qquad x_0 = \sum_{i=1}^{n} x_i .$$

We have also the following relations

$$\sigma_i^2 = \frac{1}{z-1}\sum_{k=1}^{z}(y_{i,k} - y_i)^2, \qquad u_i^2 = \frac{p_i(1-p_i)}{s_i}, \qquad \omega_i^2 = \frac{1}{z-1}\sum_{k=1}^{z}\varepsilon_{i,k}^2,$$

where $p_{i,k} - p_i = \varepsilon_{i,k}$,

$$E(\sigma_i^2) = u_i^2 + \omega_i^2, \qquad Q_i = \frac{\sigma_i}{u_i},$$

and using the notation $\beta_i = \omega_i/p_i$ and $m_i = s_i p_i$
we have further

$$E(Q_i^2) = 1 + \frac{s_i p_i}{1-p_i}\,\beta_i^2 .$$

Finally, corresponding to formula (8), we have

(9)    $$E(Q_i^2) = 1 + m_i\beta_i^2 .$$




We shall now apply these formulas to statistics on the frequency of
suicides in Germany for the decade 1902-1911. The numbers of “trials,”
$s_{i,k}$, are here the populations of the regions in question; the
“successes,” $x_{i,k}$, are the numbers of suicides for each year. The
relative frequencies, $y_{i,k}$, are found by dividing the numbers of
suicides by the corresponding populations. Like various other kinds of
social phenomena, the suicides in pre-war German statistics were
grouped according to states, the provinces of Prussia, right Rhenish
Bavaria and left Rhenish Bavaria being included as states. In this way
we have forty territories of very unequal size. For the decade
1902-1911, the mean population of the territories ranged from a
maximum of 6,587,000 (Rhine Province) to a minimum of 45,000
(Schaumburg-Lippe). The maximum average number of suicides per annum
was 1453 (Saxony) and the minimum 7 (Schaumburg-Lippe). Corresponding
to the purpose of the investigation, these suicide figures $x_i$,
which can be considered as approximations to $m_i$, were arranged in
descending order, with $x_1 = 1453$ and $x_{40} = 7$.

For the whole of Germany, we have $x_0 = 13173$,
$y_0 = 214 \cdot 10^{-6}$ (that is, an average number of 214 suicides
per annum for each million population). The ten values $y_{0,k}$ vary
between $204 \cdot 10^{-6}$ and $223 \cdot 10^{-6}$. These
fluctuations are markedly greater than one expects from the classical
norm. The calculation of the dispersion quotient gives $Q_0 = 3.14$,
and, as the Lexis theory demands, this is greater than any one of the
40 values $Q_1, \ldots, Q_{40}$. These values give 2.03 as a maximum
and 0.75 as a minimum. Fixing attention on the eight smallest values
of $x_i$, we find an average value of 1.02 for $Q_i$, and of the eight
values, three are larger and five less than 1. So in this example the
dispersion becomes very nearly 1 by narrowing the observation field.
But we have still to find out whether $Q_i$ decreases with $x_i$
according to the measure of decrease that one would expect under the
hypothesis that $\beta_i$ is fundamentally independent of $m_i$. To
decide this question, we let $\beta_i$ = const. = $\beta$, including
$\beta_0 = \beta$, and substitute also $x_i$ for $m_i$ in formula (9).
We have then on the one hand in expected values

$$Q_0^2 = 1 + x_0\beta^2$$

and on the other hand

$$\frac{1}{n}\sum_{i=1}^{n} Q_i^2 = 1 + \frac{x_0}{n}\,\beta^2,$$

from which follows

$$\frac{1}{n}\sum_{i=1}^{n} Q_i^2 = 1 + \frac{1}{n}\left(Q_0^2 - 1\right).$$

However, in our example, we find

$$\frac{1}{n}\sum_{i=1}^{n} Q_i^2 = 1.56, \qquad 1 + \frac{1}{n}\left(Q_0^2 - 1\right) = 1.22,$$

and the difference 0.34 cannot be ascribed to chance, for it is three
times the probable error (the determination of which we cannot now
take up). We must, then, assume that the average of the values
$\beta_i$, for $i = 1$ to 40, is greater than $\beta_0$. Why this is
so we shall see in the following discussion.
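The comparison just made can be retraced numerically. Under the
hypothesis $\beta_i$ = const. = $\beta_0$, with $x_i$ put for $m_i$,
summing $Q_i^2 - 1 = x_i\beta^2$ over the $n$ parts gives
$Q_0^2 - 1$, so that the expected average of the $Q_i^2$ is
$1 + (Q_0^2 - 1)/n$. The sketch below uses only the figures quoted in
the text; the function name is hypothetical.

```python
def mean_q2_under_constant_beta(q0, n):
    """Expected average of Q_i^2 over n parts if beta_i = beta_0 for every part."""
    return 1.0 + (q0 ** 2 - 1.0) / n

q0, n = 3.14, 40        # Q_0 for Germany and the number of territories, from the text
expected = mean_q2_under_constant_beta(q0, n)
observed = 1.56         # average of the Q_i^2 as reported in the text
excess = observed - expected
print(round(expected, 2), round(excess, 2))
```

The positive excess is what leads to the conclusion that the
$\beta_i$ of the territories are on the average greater than
$\beta_0$.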

We consider now the mutual relationship between the deviations
$\varepsilon_{i,k}$ and $\varepsilon_{j,k}$ which refer to two
arbitrary territories $N_i$ and $N_j$, and we build up according to
the formula for a correlation coefficient the expression

$$r_{i,j} = \frac{\sum_{k=1}^{z}\varepsilon_{i,k}\,\varepsilon_{j,k}}{\sqrt{\sum_{k=1}^{z}\varepsilon_{i,k}^2\,\sum_{k=1}^{z}\varepsilon_{j,k}^2}} .$$

The number of combinations of the subscripts $i$ and $j$ is
$n(n-1)/2$, so there are that many values $r_{i,j}$. Finally we
construct a weighted arithmetic mean of these values according to the
formula

$$\gamma = \frac{\sum_{i<j} m_i m_j \beta_i \beta_j\, r_{i,j}}{\sum_{i<j} m_i m_j \beta_i \beta_j} .$$

The expression $\gamma$ serves to characterize the mutual relationship
of time ordered series of fundamental probabilities $p_{i,k}$, hence
also of relative frequencies $y_{i,k}$ which may be considered as
approximations to $p_{i,k}$. If we give the name “syndromy” to such an
array of simultaneously distinct fundamental probabilities (or
relative frequencies), we may call $\gamma$ a “coefficient of
syndromy.” For $\gamma = 1$, we shall speak of “isodromy,” for
$1 > \gamma > 0$, of “homodromy,” for $\gamma = 0$, of “paradromy,”
and for $\gamma < 0$, of “antidromy.” We may include the last three
cases, namely $\gamma < 1$, under the name “anisodromy.”

With the help of $\gamma$ we can exhibit the relation between
$\beta_0$ on the one hand and the $n$ values $\beta_1$, $\beta_2$,
$\ldots$, $\beta_n$ on the other hand as follows:

(10)    $$m_0^2\beta_0^2 = \sum_{i=1}^{n} m_i^2\beta_i^2 + 2\gamma\sum_{i<j} m_i m_j \beta_i \beta_j .$$

Since $m_0 = \sum_{i=1}^{n} m_i$, we find for $\gamma = 1$, from (10),

(11)    $$\beta_0 = \frac{1}{m_0}\sum_{i=1}^{n} m_i\beta_i ,$$

and for $\gamma < 1$

(12)    $$\beta_0 < \frac{1}{m_0}\sum_{i=1}^{n} m_i\beta_i .$$


Hence, only in the case of isodromy is the assumption justified that
the relative essential fluctuation component for the total aggregate
is as large as that for the partial aggregates. In every other case,
namely for anisodromy, the relative essential component for the total
aggregate falls below the level for the partial aggregates more and
more as $\gamma$ becomes less and less.
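The dependence of $\beta_0$ on the coefficient of syndromy may be
illustrated by a small computation. The sketch assumes that the
relation between $\beta_0$ and the $\beta_i$ has the form
$m_0^2\beta_0^2 = \sum m_i^2\beta_i^2 + 2\gamma\sum_{i<j} m_i m_j\beta_i\beta_j$,
which follows when the deviations of the total aggregate are the
averages of the deviations of the parts weighted by $s_i/s_0$. The
values of $m_i$ and $\beta_i$ and the function name `beta_total` are
hypothetical.

```python
import math
from itertools import combinations

def beta_total(m, beta, gamma):
    """Relative essential component beta_0 of the total aggregate for a given gamma."""
    m0 = sum(m)                                           # m_0 = m_1 + ... + m_n
    squares = sum((mi * bi) ** 2 for mi, bi in zip(m, beta))
    cross = sum(m[i] * m[j] * beta[i] * beta[j]
                for i, j in combinations(range(len(m)), 2))
    return math.sqrt(squares + 2.0 * gamma * cross) / m0

m = [1000.0, 400.0, 100.0]        # hypothetical expected numbers of successes
beta = [0.03, 0.04, 0.06]         # hypothetical relative essential components
weighted_mean = sum(mi * bi for mi, bi in zip(m, beta)) / sum(m)

iso = beta_total(m, beta, 1.0)    # isodromy: coincides with the weighted mean
homo = beta_total(m, beta, 0.38)  # homodromy: below the weighted mean
para = beta_total(m, beta, 0.0)   # paradromy: lower still
print(round(weighted_mean, 4), round(iso, 4), round(homo, 4), round(para, 4))
```

Only for $\gamma = 1$ does the total aggregate reach the weighted mean
of the partial values; every smaller $\gamma$ gives a smaller
$\beta_0$.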

In the suicide example under consideration we have homodromy, which is
reasonable, since the fluctuations in suicide frequency in the single
states are influenced in part by factors which are not local but
general for all Germany. Somewhat tedious calculations give
$\gamma = 0.38$. At the same time we find $\beta_0 = 0.0246$
approximately, while the average for $\beta_i$, $i = 1$ to 40, is
0.0392.

If now we group the 40 states into five groups so that states numbered
1 to 8 form the first group, states numbered 9 to 16 the second, and
so on, we find as average values of $\beta_i$, 0.0354, 0.0358, 0.0485,
0.0528 and 0.0767. The quantities $\beta_i$ thus show a tendency to
increase as $x_i$ (or $m_i$) decreases.

If, as in this example, the total aggregate is a “natural unit,” we
should expect to have homodromy in the vast majority of cases. On the
other hand, we should expect paradromy if the total aggregate is an
“artificial unit,” that is, one made up by throwing together entirely
unrelated groups. As an illustration of paradromy we take the array of
marriage frequencies for the six cities, Barcelona, Birmingham,
Boston, Leipzig, Melbourne and Rome, for the decade 1899-1908. By
marriage frequence we mean the ratio of the number married (twice the
number of marriages) to population.

For the six cities taken as a whole, with a total population of about
three million, the marriage frequence varies between 18.00 and 19.02
per cent with an average of 18.38 per cent. The dispersion coefficient
$Q_0$ is 3.17. For the six cities taken singly in the above order,
each with a population of about half a million, the values of $Q_i$
are 2.69, 4.32, 4.17, 2.88, 3.76 and 2.72, with an average 3.42,
somewhat higher than $Q_0$. This result is a direct contradiction of
the statement of Lexis that a narrowing field of observation reduces
the value of $Q$. Lexis, without giving the matter much thought,
worked with the hypothesis that isodromy, or at least a decided
homodromy, always existed. In our example, however, we have paradromy,
if not antidromy, for we find $\gamma$ to be $-0.054$. Corresponding
to this, we have $\beta_0$ less than each of the values $\beta_1$ to
$\beta_6$, for $\beta_0$ approximates 0.0167 while $\beta_i$, $i = 1$
to 6, lies between 0.0334 and 0.0563. The quadratic mean of these
quantities is 0.0450.

It is of prime interest to investigate for paradromy the theoretical
relation of $\beta_0$ to the quadratic mean of the values $\beta_1$,
$\beta_2$, $\ldots$, $\beta_n$ and of $Q_0$ to the quadratic mean of
$Q_1$, $Q_2$, $\ldots$, $Q_n$, under the assumption $m_i$ = const.
= $m$. In this case, $m_0 = nm$, and if 0 is substituted for $\gamma$
in (10) we have

$$\beta_0^2 = \frac{1}{n^2}\sum_{i=1}^{n}\beta_i^2 .$$

At the same time we find on the one hand, from (9), the expected value

$$E(Q_0^2) = 1 + nm\beta_0^2$$

or

$$E(Q_0^2) = 1 + \frac{m}{n}\sum_{i=1}^{n}\beta_i^2,$$

and on the other hand

$$\frac{1}{n}\sum_{i=1}^{n} E(Q_i^2) = 1 + \frac{m}{n}\sum_{i=1}^{n}\beta_i^2,$$

whence

$$E(Q_0^2) = \frac{1}{n}\sum_{i=1}^{n} E(Q_i^2) .$$

In the marriage frequence example, where the quantities $m_i$, though
not equal, differ very little from one another, we have the values
already found

$$\beta_0 = 0.0167 \quad\text{and}\quad Q_0 = 3.17$$

to compare with the values

$$\frac{1}{\sqrt{n}}\sqrt{\frac{1}{n}\sum_{i=1}^{n}\beta_i^2} = 0.0184 \quad\text{and}\quad \sqrt{\frac{1}{n}\sum_{i=1}^{n} Q_i^2} = 3.49 .$$

The differences $0.0167 - 0.0184 = -0.0017$ and $3.17 - 3.49 = -0.32$
are explained partly by the fact that the assumption $m_i$ = const. is
not exactly in accord with the facts, and partly because paradromy is
really not present as assumed, but only a weak antidromy. This last
should, however, be considered as due to chance. The artificial
character of a total aggregate shows itself in paradromy.
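The figures of the marriage example can be checked from the values
quoted in the text. Under paradromy with equal $m_i$, $Q_0$ is to be
compared with the quadratic mean of the $Q_i$, and $\beta_0$ with the
quadratic mean of the $\beta_i$ divided by $\sqrt{n}$. The variable
names below are hypothetical; the numbers are those of the text.

```python
import math

q = [2.69, 4.32, 4.17, 2.88, 3.76, 2.72]    # Q_i for the six cities, from the text
n = len(q)
arithmetic_mean_q = sum(q) / n
quadratic_mean_q = math.sqrt(sum(v * v for v in q) / n)

quadratic_mean_beta = 0.0450                # quadratic mean of the beta_i, from the text
beta0_paradromy = quadratic_mean_beta / math.sqrt(n)

print(round(arithmetic_mean_q, 2), round(quadratic_mean_q, 2), round(beta0_paradromy, 4))
```

The arithmetic mean reproduces the average 3.42, and the two
theoretical values agree with those set against $Q_0 = 3.17$ and
$\beta_0 = 0.0167$.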

Of the two quantities $Q$ and $\beta$, only the latter can be
considered as a proper measure of the stability of a statistical
frequency, or more exactly, of the corresponding fundamental
probability. And, since on account of formulas (11) and (12), the
total aggregate can never show a higher value of $\beta$ than the
average for the partial aggregates (because the upper limit for
$\gamma$ is 1), we obtain a glimpse of the question of the connection
between stability and homogeneity.

The idea of homogeneity as we here understand it has reference to the
result of the decomposition of a statistical aggregate according to
some attribute or complex of attributes. The aggregate may consist of
$s$ elements, say $s$ human beings, and the decomposition may yield
$N$ sub-aggregates containing $s'$, $s''$, $\ldots$ elements. Let some
event $A$ be observed $x$ times in the total aggregate and $x'$,
$x''$, $\ldots$ times in the sub-aggregates. If we find the relative
frequencies

$$y = \frac{x}{s}, \qquad y' = \frac{x'}{s'}, \qquad y'' = \frac{x''}{s''}, \qquad \ldots$$

then, on account of the two identities, $s' + s'' + \cdots = s$ and
$x' + x'' + \cdots = x$, we have the relation

$$y = \frac{s'y' + s''y'' + \cdots}{s} .$$

The “general frequency” then appears as the weighted arithmetic mean
of the “special frequencies,” $y'$, $y''$, $\ldots$.

The theory of probabilities, with more or less assurance, furnishes us
a criterion for deciding whether or not the deviations of the
quantities $y'$, $y''$, $\ldots$ from $y$ are due to chance. If they
are not due to chance we say that the total aggregate “reacts” to the
decomposition in question and that the attribute or complex of
attributes which governs the decomposition is “relevant.” If they are
due to chance, we say that the total aggregate does not react to the
decomposition and that the attribute is “indifferent.”


According to the standpoint of the theory of probability, the relative
frequencies $y$, $y'$, $y''$, $\ldots$ as also the quotients $s'/s$,
$s''/s$, $\ldots$ can be considered as approximations of distinct
probabilities. If we designate the two series of probabilities thus
inferred by $p$, $p'$, $p''$, $\ldots$ and $g'$, $g''$, $\ldots$
respectively, we find

(13)    $$p = g'p' + g''p'' + \cdots$$

and the character of the attribute in question as relevant or
indifferent finds expression in the fact that the “special
probabilities” $p'$, $p''$, $\ldots$ either differ from one another or
are all equal to $p$, the “general probability.”

For every ample enough complex of attributes we can imagine the
decomposition going on and on by applying one attribute of the complex
after another. Finally a point is reached where the sub-aggregates no
longer react to further decomposition, or, expressed otherwise, the
supply of relevant attributes is exhausted, and the probabilities
$p'$, $p''$, $\ldots$ which are associated with these sub-aggregates
are called “elementary probabilities.” In this case we say that the
sub-aggregates themselves are “completely homogeneous” with reference
to the event $A$.

The total aggregate, still in reference to $A$, is the more
diversified the more the elementary probabilities $p'$, $p''$,
$\ldots$ differ among themselves, that is, the more they differ from
$p$. It is reasonable to take as a measure of this diversity the
expression $\delta$, defined by

(14)    $$\delta^2 = g'(p'-p)^2 + g''(p''-p)^2 + \cdots$$

Diversity and homogeneity are antithetical notions; the more
undiversified the aggregate, the more it is homogeneous, and vice
versa.

In order to apply this view of homogeneity, now considered for itself,
to the procedure and the examples which we have brought forward in the
discussion of stability, we must disregard the time fluctuations of
the probabilities in question. That is, we do not use the quantities
$p_{i,k}$ but fix attention on the probabilities $p_i$ which refer to
an individual time interval of $z$ partial intervals, say a decade. By
carrying out repeatedly the decomposition according to formula (13),
the quantities $p_i$, $p_0$ included, may be expressed in the form

$$p_i = g_i'p_i' + g_i''p_i'' + \cdots$$

where $p_i'$, $p_i''$, $\ldots$ are elementary probabilities.
Corresponding to formula (14), we have

(15)    $$\delta_i^2 = g_i'(p_i'-p_i)^2 + g_i''(p_i''-p_i)^2 + \cdots$$

If we designate the proportion of the $i$th partial aggregate to the
total aggregate by $c_i$, that is, if we let $c_i = s_i/s_0$, we find

$$p_0 = \sum_{i=1}^{n} c_i p_i$$

and at the same time

(16)    $$\delta_0^2 = \sum_{i=1}^{n} c_i\left[g_i'(p_i'-p_0)^2 + g_i''(p_i''-p_0)^2 + \cdots\right].$$

The number of summands in (16) is $nN$, since there are $n$ partial
aggregates and each of these is a totality of $N$ sub-aggregates. It
may easily occur that some of the $nN$ elementary probabilities are
equal and this is expected in connection with elementary probabilities
which are associated with similar sub-aggregates. But even in the most
extreme case, where the elementary probabilities are equal without
exception, we cannot say that the probabilities $p_i$ are all alike.
This can occur only when the values $g_i'$, $g_i''$, $\ldots$ are
independent of $i$. This highly improbable case is excluded from our
discussion. We have then

(17)    $$\sum_{i=1}^{n} c_i(p_i - p_0)^2 > 0 .$$

From (15) and (16), we have the following:

$$g_i'(p_i'-p_0)^2 + g_i''(p_i''-p_0)^2 + \cdots = \delta_i^2 + (p_i - p_0)^2,$$

$$\delta_0^2 = \sum_{i=1}^{n} c_i\delta_i^2 + \sum_{i=1}^{n} c_i(p_i - p_0)^2,$$

so that, on account of (17),

$$\delta_0^2 > \sum_{i=1}^{n} c_i\delta_i^2,$$

and a fortiori

(18)    $$\delta_0 > \sum_{i=1}^{n} c_i\delta_i .$$

The total aggregate is then under all circumstances less homogeneous
than the partial aggregates are on the average.
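The decomposition of the diversity of the total aggregate can be
verified on a small invented example. The sketch assumes that the
diversity is measured by $\delta^2 = \sum g\,(p' - p)^2$ taken over
the completely homogeneous sub-aggregates, and that the total
aggregate is made up of the partial aggregates in the proportions
$c_i$. All shares and probabilities below are hypothetical.

```python
import math

# Hypothetical example: n = 2 partial aggregates, each split into N = 2 sub-aggregates.
c = [0.6, 0.4]                         # shares c_i of the partial aggregates in the total
g = [[0.5, 0.5], [0.25, 0.75]]         # shares of the sub-aggregates inside each part
p = [[0.010, 0.030], [0.020, 0.060]]   # elementary probabilities of the sub-aggregates

# probability of each partial aggregate and of the total aggregate
p_i = [sum(gj * pj for gj, pj in zip(g[i], p[i])) for i in range(2)]
p_0 = sum(ci * pi for ci, pi in zip(c, p_i))

# diversity inside each partial aggregate, measured from its own p_i
delta2_i = [sum(gj * (pj - p_i[i]) ** 2 for gj, pj in zip(g[i], p[i])) for i in range(2)]
# diversity of the total aggregate, measured from p_0 over all n*N sub-aggregates
delta2_0 = sum(c[i] * gj * (pj - p_0) ** 2
               for i in range(2) for gj, pj in zip(g[i], p[i]))

within = sum(ci * d2 for ci, d2 in zip(c, delta2_i))
between = sum(ci * (pi - p_0) ** 2 for ci, pi in zip(c, p_i))
mean_delta = sum(ci * math.sqrt(d2) for ci, d2 in zip(c, delta2_i))
print(delta2_0, within + between, math.sqrt(delta2_0), mean_delta)
```

The total diversity splits into the average diversity within the parts
and the diversity between the parts; since the latter is positive,
$\delta_0$ exceeds the weighted average of the $\delta_i$.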

This statement might possibly correspond to the every-day meaning of
the word “homogeneity,” which carries with it no precise quantitative
idea. Indeed, when we consider that in the case of the total aggregate
we have to take into account not only the lack of homogeneity within
the partial aggregates, but also the diversity with which the partial
aggregates may make up the whole, we are inclined to say that the
total aggregate is less homogeneous than any of its parts. With that
idea, however, we do not hit upon the right thing as far as our
mathematical criterion of homogeneity is concerned. The inequality
(18) says only that the average of the values $\delta_1$, $\delta_2$,
$\ldots$, $\delta_n$ is less than $\delta_0$, not that each one is
less than $\delta_0$.

In our foregoing discussion of stability as measured by the relative
essential fluctuation component, we found that for the total aggregate
the stability was higher than the average for the partial aggregates,
except for the case of isodromy, which in practice rarely occurs.
Hence, there exists between homogeneity and stability an antagonistic
relation: small homogeneity goes hand in hand with great stability.
For example, the provinces into which a country may be divided will
show, on the average, a greater homogeneity and at the same time a
lesser stability in reference to an event $A$ than will the country
taken as a whole. Again, the districts into which the provinces may be
divided will on the average show a greater homogeneity associated with
a still smaller stability. We can say that in general the homogeneity
increases with the narrowing of the field of observation, while the
stability decreases.

Is this to be considered as a warning against the all too popular
diversification of statistical material which is being more and more
accepted in research methods? Not in the least. That would be an
obsolete point of view, as if the problem of statistics consisted in a
search for most stable values. Rather does the opposition between
homogeneity and stability give direction to business practice,
especially to that branch of business which is in such close touch
with statistics, namely insurance, where stability is of prime
importance. It has been known for a long time that it contributes to
the even tenor of the business side if the risks are as heterogeneous
as possible. It is of advantage if the insured persons or things are
spread relatively widely according to geographical and other points of
view, instead of concentrating on a limited territory or few kinds of
risks.

Accordingly, even if this thesis, that an antagonistic relation exists
between homogeneity and stability, seems surprising and strange, we
find on closer consideration that the theory agrees with a practice
which has instinctively grasped the true situation. It is now twelve
years since I had the first opportunity to explain at greater length
than here the foregoing developed ideas and with the verifying data to
present them to my colleagues.⁴ As far as I know, only one of these
has taken a definite stand in the matter. This is John Maynard
Keynes.⁵ He makes the charge against me, that instead of clearing up a
very simple matter, I have befogged it with a profusion of
mathematical formulas and new technical terms, and he believed that he
could show this best by an example of my own from the field of
insurance. In referring to this example, Keynes thought that the
distinction made by myself in a much earlier publication between a
general probability $p$ and the special probabilities $p_1$, $p_2$,
$\ldots$ was the one in question, where

$$p = \frac{z_1}{z}\,p_1 + \frac{z_2}{z}\,p_2 + \cdots \;^{6}$$

Keynes further expressed himself as follows:

    “If we are basing our calculations on $p$ and do not know
    $p_1$, $p_2$, etc., then these calculations are more likely to
    be borne out by the result if the instances are selected by a
    method which spreads them over all the groups 1, 2, etc., than
    if they are selected by a method which concentrates them on
    group 1. In other words the actuary does not like an undue
    proportion of his cases to be drawn from a group which may be
    subject to a common relevant influence for which he has not
    allowed. If the a priori calculations are based on the average
    over a field which is not homogeneous in all its parts, greater
    stability of result will be obtained if the instances are drawn
    from all parts of the non-homogeneous total field, than if they
    are drawn now from one homogeneous sub-field and now from
    another. This is not at all paradoxical. Yet I believe, though
    with hesitation, that this is all that Von Bortkiewicz's
    elaborately supported mathematical conclusion amounts to.”

Suppose, for example, that a fire insurance company insures two kinds
of buildings, dwellings and factories, which are classified as
different grades of fire risks, for insurance premiums which are not
graded. The premium is to be calculated per unit on the supposition
that the risks in the two categories are divided in a definite
proportion. Then, according to Keynes, a greater stability in the
business is guaranteed if every year dwellings as well as factories
are insured, than if in one year only dwellings and in another year
only factories are insured. This is certainly true and requires no
lengthy argument. But it has nothing whatever to do with my thesis of
the antagonistic relation between stability and homogeneity.

⁴Homogeneität und Stabilität in der Statistik, in the Skandinavisk
Aktuarietidskrift, 1918, pages 1-81, Upsala.

⁵A Treatise on Probability, London, 1921, pages 403-405.

⁶Here $z$ refers to a series of “equally likely events,” which is
broken up into groups of $z_1$, $z_2$, $\ldots$ equally likely events.
Hence $z = z_1 + z_2 + \cdots$

To give an example which does illustrate my theory, think 
of three insurance companies, A, B, and C. A insures only 
dwelling houses, B only factories, while C insures both. The 
premiums in A, B, and C are different because of the different
classes of risks. It is assumed in C that there is no grading of 
premiums. A premium per unit is charged which is calculated 
according to the relative number of the two risks. The premium 
is to be just high enough so that for a period of years, allowing 
for variations due to chance, the damages are just covered. In 
the course of this period, the danger of fire varies from year to 
year, showing gains in some years, losses in others. Such fluctuations
of fire hazard would correspond in my scheme to the variations
of the probabilities with respect to k , while ^

is associated with A, ^ with B, and ^ with C. And in 
accord with my theory that, except in the case of isodromy, the 
values ^ , relatively speaking, show weaker variations than

p^ ^ and ^ do on the average, the insurance company C 
would show relatively smaller fluctuations of fire damage from 
one year to another, resulting in a more stable business than 
would be shown by the average of A and B. The mixed character
of the risks would be conducive to greater stability. In the
case of C a certain compensation of effects would take place
which the time variations of the two-sided fundamental probabilities
would make manifest on the business side.* But Keynes
says nothing of these variations. He simply missed the point of
my argument and his remarks were not relevant.

It is to be hoped that the new exposition of my theory,
although, or because, it is essentially shorter than the older one,
will give no cause for a similar misunderstanding.


*This compensation would also appear in the more complicated case where
the proportions of the risks in C are not unchangeable as is assumed in
the text, but would change from year to year (the premium being adjusted
accordingly). We need not go further into this matter because, in my
theory, the composition of s,.,, out of the component parts is 

considered as fixed. In my examples, this composition varied, but the
fluctuations were insignificant in comparison to the variations of the values
. See Skandinavisk Aktuarietidskrift, pages 69-70.





BAYES’ THEOREM 


An Expository Presentation* 


By 

Edward C. Molina

American Telephone and Telegraph Company


Bayes’ theorem made its appearance as the ninth proposition
in an essay which occupies pages 370 to 418 of the Philosophical
Transactions, Vol. 53, for 1763. An introductory letter written
by Richard Price, “Theologian, Statistician, Actuary and Political
Writer,”* begins thus:

“I now send you an essay which I have found
amongst the papers of our deceased friend, Mr. Bayes,
and which, in my opinion, has great merit, and well
deserves to be preserved.”

A few lines further on Price says: 

“In an introduction which he has writ to this Essay, 
he says, that his design at first in thinking on the subject 
of it was, to find out a method by which we might judge 
concerning the probability that an event has to happen, in 
given circumstances, upon supposition that we know 

*Read before the American Statistical Association during the meeting of the
American Association for the Advancement of Science in Cleveland, Ohio,
December, 1930.

* These titles are associated with the name of Price in the frontispiece portrait
of him bound with the December, 1928, issue of Biometrika.





nothing concerning it but that, under the same circumstances,
it has happened a certain number of times, and
failed a certain other number of times.”

“Every judicious person will be sensible now that the
problem mentioned is by no means merely a curious speculation
in the doctrine of chances, but necessary to be
solved in order to assure a foundation for all our reasonings
concerning past facts, and what is likely to be hereafter.”

No one will dispute the importance ascribed to Bayes’ problem
by Price; in fact, a paper by Karl Pearson on an extension of
Bayes’ problem is entitled “The Fundamental Problem of Practical
Statistics.” Opinions differ, however, as to the validity and
significance of the solution submitted in the essay for the problem
in question. In view of this situation I shall limit myself today
to an exposition of the fundamental characteristics of the problem
Bayes’ theorem deals with and shall give no consideration to
its interesting applications.

The exposition may be outlined as follows: after specifying
the class of problems to which Bayes’ theorem pertains, I shall:

I. Discuss briefly two problems, each of which will emphasize
one of two kinds of a priori probabilities which should be constantly
borne in mind when Bayes’ theorem is under consideration.

II. Partially analyze a certain ball-drawing problem which
will not only serve as an introduction to the algebra of Bayes’
theorem but will later help to throw light on its significance.

III. Present Bayes’ problem and the related theorem.

IV. Make some remarks on the value of the theorem and
the controversies which it raised.

In carrying out this plan I shall find it convenient to ignore 
the historic order of events. 

When probability is the subject under consideration one anticipates
problems such as: A coin is about to be tossed 15 times;
what is the probability that heads will turn up seven times? A
sample of 100 screwdrivers is to be taken from a case containing
1000 screwdrivers of which 300 are known to be defective; what
is the probability that the sample will contain 25 defectives?

These are direct, or a priori, probability problems. In each 
of them the nature of a game, or an experiment, is specified in 
advance and then a question is asked relating to one, or more, of 
the possible outcomes of the game or experiment. Problems of 
this type have occupied the attention of mathematicians since the 
days of Pascal and Fermat, the creators of the mathematical theory 
of probability. 

An inverse class of problems of great practical significance, 
called a posteriori probability problems, came into prominence with 
the publication of Bayes' essay. In these we find specified the result
or outcome of a game which has been played, whereas the
question then asked is whether the game actually played was one
or some other of several possible games. This type of problem
is usually stated as follows:

“An event has happened which must have arisen from
some one of a given number of causes; required the probability
of the existence of each of the causes.”

I 

Consider this example: During his sophomore year Tom
Smith played on both the baseball and football varsity teams;
we have been informed that he broke his ankle in one of the
games; what are the a posteriori probabilities in favor of baseball 
and football, respectively, as the baneful cause of the accident? 

Evidently the answer depends on the number of baseball and
football games played during their respective seasons and also on
the likelihood of a man breaking an ankle in one or the other of
these two games. As a concrete case assume that:

1. At Smith's college an equal number of baseball and football 
games are played per season; 

2. Statistical records indicate that if a student participates in a
baseball game the probability is 2/100 that he will break an
ankle and that, likewise, the probability is 7/100 for the same
contingency in a football game.

In view of the first of these two assumptions our conclusions
as to the cause of the accident may be based entirely on the
information contained in the second assumption. The odds are
two to seven, so that the a posteriori probabilities regarding the
two admissible causes are:

For baseball, 2/(2+7) = 2/9.

For football, 7/(2+7) = 7/9.
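
The arithmetic of the broken-ankle example can be checked mechanically. What follows is only a minimal sketch in Python; the function name `posterior` and the list layout are illustrative assumptions and form no part of the original exposition.

```python
from fractions import Fraction

def posterior(existence, productive):
    """Combine a priori existence probabilities with a priori productive
    probabilities (one entry per admissible cause) into a posteriori
    probabilities. Hypothetical helper, named here for illustration."""
    weights = [w * b for w, b in zip(existence, productive)]
    total = sum(weights)
    return [wt / total for wt in weights]

# Equal numbers of baseball and football games: equal existence probabilities.
existence = [Fraction(1, 2), Fraction(1, 2)]
# Probability of a broken ankle in a single game of each sport.
productive = [Fraction(2, 100), Fraction(7, 100)]

baseball, football = posterior(existence, productive)
print(baseball, football)
```

Because the existence probabilities are equal, they cancel, and the result depends only on the productive probabilities.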

Now consider this other example. A lone diner amused himself
between courses by spinning a coin. We elicited from the
waiter that in 15 spins heads turned up seven times. Moreover,
from our point of observation, the size of the coin indicated that
it was either a silver quarter or a ten-dollar gold piece. What are
the a posteriori probabilities in favor of the silver quarter and the
gold piece, respectively?

If the lone diner were a professor from one of our eastern 
universities we would not hesitate a moment in declaring that the 
coin spun was a quarter. But it happens that the gentleman was 
a member of the Cleveland Chamber of Commerce, dining at the 
Bankers’ Club. We must, therefore, give the matter more careful
consideration. The number of quarters and gold pieces usually 
carried by a banker and the probabilities of obtaining the observed 
result by spinning coins are relevant; let us assume, therefore, 
that: 

1. The small change purse of a Cleveland financier contains, on
the average, ten-dollar gold pieces and quarters in the ratio of
eight to three.

Moreover, we may assume (in fact we know) that: 

2. If either a quarter or a gold piece is spun 15 times, the probability
that heads will turn up seven times is approximately 1/5.

The second of these two items of information makes the a
posteriori probabilities depend entirely on the first item. Clearly
the odds are eight to three and we conclude:

For a quarter, a posteriori probability = 3/(3+8) = 3/11.

For a gold piece, a posteriori probability = 8/(3+8) = 8/11.
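
Both items of information in the coin example can be verified with a few lines of Python. This is a sketch under the assumption that each coin is fair, so that the productive probability is the binomial term for 7 heads in 15 spins; the variable names are illustrative only.

```python
from fractions import Fraction
from math import comb

# Productive probability: 7 heads in 15 spins of a coin assumed to be fair.
productive = Fraction(comb(15, 7), 2 ** 15)

# Existence probabilities: gold pieces to quarters in the ratio 8 to 3.
prior_gold, prior_quarter = Fraction(8, 11), Fraction(3, 11)

weight_gold = prior_gold * productive
weight_quarter = prior_quarter * productive
total = weight_gold + weight_quarter

post_quarter = weight_quarter / total
post_gold = weight_gold / total
print(float(productive), post_quarter, post_gold)
```

The common productive probability cancels, which is why the a posteriori odds reproduce the a priori ratio of eight to three.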

Now regarding the general a posteriori problem, 

“An event has happened which must have arisen
from some one of a number of causes; required the probability
of the existence of each of the causes,”

what do the two examples we have just considered suggest? In 
both problems we inquired into: 

1. The frequency with which each of the possible causes is met
BEFORE THE OBSERVED EVENT HAPPENED. This frequency
is called the a priori existence probability for the corresponding
cause.

2. The probability that a cause, if brought into play, would reproduce
the observed event. This probability will hereafter
be referred to as the a priori productive probability for the
cause in question.

In the case of the broken ankle, the a priori existence probabilities
were equal and took no part in our conclusion; we based
the a posteriori probabilities entirely on the a priori productive
probabilities. We did just the opposite with reference to the coin
spun by the Cleveland financier; on account of the equality of the
a priori productive probabilities we deduced a posteriori probabilities
in terms of the unequal a priori existence probabilities.

It is apparent that our two examples represent extreme cases.
In general, the solution of an inverse or a posteriori problem, involving
a number of causes, one of which must have brought about
a certain observed event, depends on both sets of direct, or a priori
probabilities. Those of the first set give the frequency with which
the various causes were to be expected before the observed result 
occurred; those of the second set give the frequencies with which 
the observed result would follow from the various causes if each 
were brought into play. 


II

Bearing in mind the two distinctly different sets of a priori
probabilities required in arriving at a posteriori conclusions regarding
the possible causes of an observed event, we must now
give some thought to the algebra of the subject before taking up
Bayes’ problem and theorem. For this purpose consider the following
bag problem:

A bag contained M balls, of which an unknown number
were white. From this bag N balls were drawn and of these T
turned out to be white. What light does this outcome of the
drawings throw on the unknown ratio of the number of white
balls to the total number of balls, M, in the bag? Let x be
this unknown ratio.

Two cases of this problem may be considered:

Case 1. After a ball was drawn it was replaced and the bag was
shaken thoroughly before the next drawing was made.

Case 2. A drawn ball was not replaced before the next drawing.

These two cases become essentially identical when the total
number of balls in the bag is very large compared with the number
drawn. Case 1 will serve as an introduction to Bayes’ problem;
later we will find it highly desirable to consider Case 2.

We are confronted with (M + 1) possible hypotheses or
causes before the drawings took place:

1 - the unknown value of x is $x_0 = 0/M$,

2 - the unknown value of x is $x_1 = 1/M$,

3 - the unknown value of x is $x_2 = 2/M$,

. . . . . .

k + 1 - the unknown value of x is $x_k = k/M$,

. . . . . .

M + 1 - the unknown value of x is $x_M = M/M$.

Let $w(x_k)$ be the a priori existence probability for the k'th
hypothesis; by this is meant the probability in favor of the k'th
hypothesis based on whatever information was available regarding
the contents of the bag prior to the execution of the drawings.
Let $B(T, N, x_k)$ be the a priori productive probability
for the k'th hypothesis; by this is meant the probability of obtaining
the observed result (T whites in N drawings) when the
value of x is k/M.

Then, the a posteriori probability, or probability after the
observed event, in favor of the k'th hypothesis is

(1)¹  $P_k = \dfrac{w(x_k)\, B(T, N, x_k)}{\sum_{t=0}^{M} w(x_t)\, B(T, N, x_t)}$

For Case 1 of our bag problem we have

$B(T, N, x_k) = \binom{N}{T}\, x_k^{T} (1 - x_k)^{N-T}$


¹This is the Laplacian generalization of Bayes’ formula, although in some
textbooks it is referred to as “Bayes’ Theorem.” A relatively short demonstration
of it is given by Poincaré in his Calcul des Probabilités. See
also Fry, Probability and its Engineering Uses, Art. 49.




where $\binom{N}{T}$ represents the number of combinations of N
things taken T at a time. Substituting in (1), we obtain,
after canceling from numerator and denominator the common
factor $\binom{N}{T}$,

(2)  $P_k = \dfrac{w(x_k)\, x_k^{T} (1 - x_k)^{N-T}}{\sum_{t=0}^{M} w(x_t)\, x_t^{T} (1 - x_t)^{N-T}}$

If in equation (2) we give k successively the values a,
a + 1, a + 2, . . . , b - 1, b and add the results, we
have

$P = P_a + P_{a+1} + \cdots + P_b$

or

(3)  $P = \dfrac{\sum_{k=a}^{b} w(x_k)\, x_k^{T} (1 - x_k)^{N-T}}{\sum_{t=0}^{M} w(x_t)\, x_t^{T} (1 - x_t)^{N-T}}$

for the a posteriori probability that the unknown ratio of white
to total balls in the bag lies between a/M and b/M, both
inclusive.
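
Equations (2) and (3) lend themselves to direct computation. The sketch below assumes Case 1 (drawing with replacement) and, when no existence probabilities are supplied, a uniform prior over the M + 1 hypotheses; the names `bag_posterior` and `interval_probability` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def bag_posterior(M, N, T, prior=None):
    """Equation (2): a posteriori probabilities P_k for k = 0..M, given
    T white balls in N drawings with replacement. `prior` holds w(x_k);
    a uniform prior is assumed when it is omitted."""
    if prior is None:
        prior = [Fraction(1, M + 1)] * (M + 1)
    xs = [Fraction(k, M) for k in range(M + 1)]
    weights = [w * x ** T * (1 - x) ** (N - T) for w, x in zip(prior, xs)]
    total = sum(weights)
    return [wt / total for wt in weights]

def interval_probability(post, a, b):
    """Equation (3): probability that x lies between a/M and b/M inclusive."""
    return sum(post[a:b + 1])

post = bag_posterior(M=10, N=5, T=2)
print(interval_probability(post, 3, 5))
```

With a uniform prior the most probable hypothesis on this grid is k = 4, that is, x = 4/10, which coincides with the observed proportion T/N = 2/5.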


III

BAYES’ PROBLEM 

Consider the table represented by the rectangle ABCD in
Fig. 1. On this table a line OS was drawn parallel to, but at
an unknown distance from, the edges AD and BC. Then
a ball was rolled on the table N times in succession from the
edge AD toward the edge BC. As indicated in the figure
it was noted that T times the ball stopped rolling to the right
of the line OS and N - T times to the left of that line.
What light does this information shed on the unknown distance
from AD to OS? In more technical terms, what is
the a posteriori probability that the unknown position of the line
OS lies between any two positions in which we may be interested?


Fig. 1


Each rolling of the ball was executed in such a manner that
the probability of the ball coming to rest to the right of OS is
given by the unknown ratio of the distance OA to the length
BA of the table; likewise, the probability of the ball stopping
to the left of OS is given by the ratio of the distance BO to
the length BA.

Set $x = OA/BA$, $1 - x = BO/BA$.

The only difference between this problem and the bag of balls
problem is that now the possible values of x are not restricted
to the finite set 0/M, 1/M, 2/M, . . . , M/M;
in the table problem x may have had any value whatever between
the limits of 0 and 1. Therefore equation (3) will answer the
question asked provided we substitute definite integrals in place
of the finite summations. This substitution gives us, for the desired
a posteriori probability that x had a value between $x_1$ and
$x_2$, the formula

(4)  $P(x_1, x_2) = \dfrac{\int_{x_1}^{x_2} w(x)\, x^{T} (1 - x)^{N-T}\, dx}{\int_{0}^{1} w(x)\, x^{T} (1 - x)^{N-T}\, dx}$


Equation (4) is useless until the form of the a priori existence
function $w(x)$ is specified; this depends on the way in
which the line OS was drawn. Bayes assumed that the line
OS, of unknown distance from AD, was drawn through the
point of rest corresponding to a preliminary roll of the ball. This
amounts to postulating that all values of x between 0 and 1
were a priori equally likely. In other words, with Bayes, the
a priori existence function $w(x)$ was a constant which, therefore,
did not have to be taken into consideration.¹ Thus, instead of
equation (4), Bayes gave the equivalent of the following restricted
formula:

(5)  $P(x_1, x_2) = \dfrac{\int_{x_1}^{x_2} x^{T} (1 - x)^{N-T}\, dx}{\int_{0}^{1} x^{T} (1 - x)^{N-T}\, dx}$

I say “the equivalent of” (5) because in Bayes’ day definite
integrals were expressed in terms of corresponding areas.

Equation (5) constitutes Proposition 9 of the essay, but is
usually referred to as Bayes’ theorem.
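
Equation (5) involves only integrals of polynomials, so it can be evaluated exactly. The following Python sketch expands $(1 - x)^{N-T}$ by the binomial theorem and integrates term by term; the function names are hypothetical, and the numerical case (7 successes in 15 rollings) is borrowed from the coin example purely for illustration.

```python
from fractions import Fraction
from math import comb

def integral(T, N, lo, hi):
    """Exact integral of x**T * (1 - x)**(N - T) from lo to hi, obtained
    by expanding (1 - x)**(N - T) with the binomial theorem."""
    lo, hi = Fraction(lo), Fraction(hi)
    total = Fraction(0)
    for j in range(N - T + 1):
        power = T + j + 1
        total += comb(N - T, j) * (-1) ** j * (hi ** power - lo ** power) / power
    return total

def bayes_interval(T, N, x1, x2):
    """Equation (5): a posteriori probability that x lies between x1 and
    x2 when all values of x are a priori equally likely."""
    return integral(T, N, x1, x2) / integral(T, N, 0, 1)

print(float(bayes_interval(7, 15, Fraction(1, 4), Fraction(3, 4))))
```

Exact rational arithmetic is used here only to avoid rounding questions; for large N a numerical routine would be the more practical choice.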

¹The existence function does not appear either explicitly or implicitly
anywhere in Bayes’ essay. This fact raises the question as to whether
or not Bayes had any notion of the general problem of causes.





IV. 

Equation (5) is a very beautiful formula; but we must be
cautious. More than one high authority has insinuated that its
beauty is only skin deep. Speaking of Laplace’s generalization
and extension of the theorem, George Chrystal, the English mathematician
and actuary, closed a severe attack on the whole theory
of a posteriori probability¹ with the statement that “Practical people
like the Actuaries, however much they may justly respect
Laplace, should not air his weaknesses in their annual examinations.
The indiscretions of great men should be quietly allowed
to be forgotten.”

Chrystal’s advice as to the attitude one should assume toward
“the indiscretions of great men” is excellent, but in the case under
consideration, it was the plaintiff rather than the defendant who
committed indiscretions; this is discussed in a paper by E. T.
Whittaker² entitled “On Some Disputed Questions of Probability.”

The discussions and disputes, which began shortly after the 
birth of the formula in 1763 and which have not as yet subsided, 
may be divided into two classes: 

1. Discussions concerning problems in which it is known that the 
a priori existence function is not a constant. 

2. Discussions concerning problems in which nothing whatever 
is known concerning the a priori existence function. 

The discussions of Class 1 are out of order in so far as
Bayes’ theorem is concerned; recourse should be had to formula
(4), Laplace’s generalization of Bayes’ theorem, when it is
known that $w(x)$ is not a constant. Failure to differentiate

¹“On Some Fundamental Principles in the Theory of Probability,” Transactions
of the Actuarial Society of Edinburgh, Vol. II, No. 13.

²Transactions of the Faculty of Actuaries in Scotland, Vol. VIII, Session
1919-1920.





explicitly between equations (4) and (5) has created a great deal
of confusion of thought concerning the probability of causes. The
discussions of Class 2 have centered on what Boole called “the
equal distribution of our knowledge, or rather of our ignorance,”
that is to say “the assigning to different states of things of which
we know nothing, and upon the very ground that we know nothing,
equal degrees of probability.” Regarding the legitimacy of
this procedure Bayes himself contributed a very important scholium,
which appeared in his essay on pages 392 and 393. The
argument in this scholium, based on a corollary to Proposition 8
of the essay, may be summarized as follows:

Assuming that all values of x are a priori equally likely and
that the N throws of a ball on the table have not yet been made,
the probability that T times the ball will rest to the right of OS
and that the remaining N - T times it will rest to the left of
OS is (as shown in the corollary)

(6)  $P = \dfrac{1}{N + 1}$,

a result in which T does not appear. In other words, any assigned
outcome for the throws is no more, or no less, likely than
any other outcome, if a priori all values of x are equally likely.
But, wrote Bayes in the scholium, when we say that we have no
knowledge whatever a priori regarding the ratio x, do we not
really mean that we are in the dark as to what will be the outcome
when we proceed to make N throws? If so, then equation
(6) justifies the assumption that a priori all values of x are
equally likely.
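
The independence of T asserted for equation (6) can be confirmed by exact arithmetic. The sketch below assumes that the probability in question is the integral of $\binom{N}{T} x^{T}(1-x)^{N-T}$ over the interval 0 to 1, and evaluates it with the standard beta-integral value $T!\,(N-T)!/(N+1)!$; the function name is illustrative.

```python
from fractions import Fraction
from math import comb, factorial

def prob_T_rights(N, T):
    """Probability of T stops to the right of OS in N throws when all
    values of x are a priori equally likely (uniform prior assumed)."""
    beta = Fraction(factorial(T) * factorial(N - T), factorial(N + 1))
    return comb(N, T) * beta

N = 6
values = [prob_T_rights(N, T) for T in range(N + 1)]
print(values)
```

Every entry of the printed list is the same fraction, 1/(N + 1), whatever the value of T.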

To clinch his argument it must be shown that the converse
of equation (6) is true. That is, it must be shown that, if any
outcome of throws not yet made is as likely as any other, then
any value of x is a priori as likely as any other. This converse
theorem was submitted to Dr. F. H. Murray, who obtained an
elegant proof based on a theorem of Stieltjes.¹

In view of Bayes’ corollary and his scholium, an analysis of 
our bag problem with reference to the “equal distribution of our
knowledge, or ignorance” is in order. 

Consider again Case 1 where each drawn ball is replaced in 
the bag before the next drawing is made. 

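Assuming each of the (M + 1) permissible hypotheses to be
a priori equally likely, the probability that N drawings, not yet
made, will result in T white and N - T black balls is

(7)  $P = \dfrac{1}{M + 1} \sum_{k=0}^{M} \binom{N}{T} \left(\dfrac{k}{M}\right)^{T} \left(1 - \dfrac{k}{M}\right)^{N-T}$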

Equation (7) is not, in general, independent of T,² so that
any one assigned outcome of N drawings is not as likely as any
other outcome. This result is disturbing; at first sight it seems
to discredit Bayes’ scholium. We must, therefore, look into
the matter more closely.
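
The dependence on T is easy to exhibit numerically. The sketch below assumes that equation (7) is the average, over the M + 1 equally likely compositions, of the binomial probability of T white balls in N drawings with replacement; the function name and the particular values M = 2, N = 4 are chosen only for illustration.

```python
from fractions import Fraction
from math import comb

def outcome_probability(M, N, T):
    """Probability of T white in N drawings with replacement when the
    compositions k = 0..M are a priori equally likely (assumed form of
    equation (7))."""
    total = Fraction(0)
    for k in range(M + 1):
        x = Fraction(k, M)
        total += comb(N, T) * x ** T * (1 - x) ** (N - T)
    return total / (M + 1)

M, N = 2, 4
table = [outcome_probability(M, N, T) for T in range(N + 1)]
print(table)
```

For this small bag the extreme outcomes T = 0 and T = N are markedly more likely than the intermediate ones, so the outcomes are far from equally probable.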

Bayes’ problem corresponds to drawings from a bag containing
an infinite number of balls. Therefore, even if drawn
balls are replaced, the chance of a particular ball being drawn
more than once is zero. But when N drawings with replacements
are made from a bag containing a finite number, M, of
balls, we are by no means certain of drawing N different balls;

¹Bulletin of the American Mathematical Society, February, 1930.

²Consider, for example, the case of M = 2. Equation (7) reduces to
a result which is not independent of T.





a particular white ball may be drawn several times over, and, likewise,
a particular black ball may appear more than once. It is not
surprising, therefore, that Case 1 of the bag problem does not
confirm Bayes’ corollary.

Consider now Case 2, where the drawn balls are not returned
to the bag. If k of the total balls are white and the rest black,
the probability that a sample of N balls from the bag will contain
T white and N - T black is

$\binom{k}{T} \binom{M-k}{N-T} \Big/ \binom{M}{N}$

Hence, if the permissible values 0, 1, 2, 3, . . . , M for k
are all equally likely a priori, we obtain instead of (7),

$P = \dfrac{1}{M + 1} \sum_{k=0}^{M} \binom{k}{T} \binom{M-k}{N-T} \Big/ \binom{M}{N} = \dfrac{1}{N + 1}$,

a result independent of any assigned value for T and identical
with the result in the corollary to Proposition 8 of the essay.
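
The corresponding check for Case 2 is equally short. The sketch assumes the hypergeometric probability for a sample drawn without replacement and a uniform prior on the number k of white balls; the function name and the values M = 10, N = 4 are illustrative.

```python
from fractions import Fraction
from math import comb

def outcome_probability_no_replacement(M, N, T):
    """Probability of T white in a sample of N drawn without replacement
    when k = 0..M white balls are a priori equally likely."""
    total = Fraction(0)
    for k in range(M + 1):
        # math.comb returns 0 when the lower index exceeds the upper one.
        total += Fraction(comb(k, T) * comb(M - k, N - T), comb(M, N))
    return total / (M + 1)

M, N = 10, 4
table = [outcome_probability_no_replacement(M, N, T) for T in range(N + 1)]
print(table)
```

Each outcome now has the same probability 1/(N + 1), in contrast with the drawings made with replacement.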


SUMMARY 

Bayes’ theorem is the answer to a special case of the general
problem of causes. The special case postulates that the a priori
existence probabilities for the various admissible causes of an observed
event are equal.

In the essay Bayes recommends that his theorem be adopted
whenever we find ourselves confronted with total ignorance as
to which one of several possible causes produced an observed
event. To justify this recommendation Bayes takes the attitude
that: A state of total ignorance regarding the causes of an observed
event is equivalent to the same state of total ignorance as
to what the result will be if the trial or experiment has not yet
been made. This interpretation is a generalization of the fact
that in his billiard table problem, the assumption of equal likelihood
for all possible positions of the line OS gives equal probabilities
for the various possible outcomes of a set of N ball
rollings not yet made.

Laplace, Poincaré and Edgeworth¹ have shown that the a
priori existence function $w(x)$, which appears in the Laplacian
generalization of Bayes’ theorem, is of negligible importance when
the numbers N and T are large. Therefore, when this condition
holds, one need not hesitate to use Bayes’ restricted formula
for the solution of a problem of causes.
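
A small numerical experiment illustrates the diminishing influence of the existence function. The sketch compares Bayes' restricted formula with the Laplacian generalization for one hypothetical non-constant prior, $w(x) = 2x$, chosen only because it keeps the integrals polynomial; with this prior, formula (4) coincides with formula (5) evaluated at T + 1 and N + 1. All names and numerical cases are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def integral(T, N, lo, hi):
    """Exact integral of x**T * (1 - x)**(N - T) from lo to hi."""
    lo, hi = Fraction(lo), Fraction(hi)
    total = Fraction(0)
    for j in range(N - T + 1):
        p = T + j + 1
        total += comb(N - T, j) * (-1) ** j * (hi ** p - lo ** p) / p
    return total

def interval(T, N, x1, x2):
    """Interval probability under a constant existence function."""
    return integral(T, N, x1, x2) / integral(T, N, 0, 1)

def prior_effect(T, N, x1, x2):
    """Absolute change in the interval probability when the constant
    existence function is replaced by the hypothetical w(x) = 2x."""
    uniform = interval(T, N, x1, x2)
    weighted = interval(T + 1, N + 1, x1, x2)  # extra factor x from w(x)
    return abs(float(uniform - weighted))

x1, x2 = Fraction(3, 10), Fraction(1, 2)
small = prior_effect(4, 10, x1, x2)     # T/N = 0.4 with few trials
large = prior_effect(80, 200, x1, x2)   # same ratio with many trials
print(small, large)
```

The discrepancy between the two formulas for the larger experiment is a small fraction of that for the smaller one, in keeping with the statement above.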

The transmission, by Price, of Bayes’ posthumous essay to 
the Royal Society marked an epoch in the history of the literature 
on probability theory. As mentioned at the beginning of this 
paper, Karl Pearson has called the extension of Bayes’ problem 
the “Fundamental Problem of Practical Statistics.” 


¹Laplace: “Oeuvres,” Vol. 9, p. 470. Poincaré: “Calcul des Probabilités,”
2d edition, p. 255. Bowley: “F. Y. Edgeworth’s Contributions to Mathematical
Statistics,” pp. 11 and 12.





ON CERTAIN PROPERTIES OF FREQUENCY 
DISTRIBUTIONS OBTAINED BY A LINEAR 
FRACTIONAL TRANSFORMATION OF THE 
VARIATES OF A GIVEN DISTRIBUTION


By 

H. L. Rietz


Considerable evidence has been presented by R. A. Fisher¹
to show that, by an appropriate transformation $z = f(r)$ of
small sample correlation coefficients r distributed
in accord with a decidedly skew frequency curve, values of z
are obtained which are distributed nearly in a normal distribution.
In fact, the approach of the distribution of z to normality
seems sufficiently rapid to justify the use of the probable error
of z in many applications as if it were normally distributed.
Such a change in the character of the distribution of an important
statistic suggests the further study of properties of the distribution
of variables obtained by applying rather simple transformations
to variates distributed from -1 to +1 in accord with a given
frequency function. In a previous paper,² the writer has dealt
with a similar problem when each variate of a given unimodal
distribution of any finite range is replaced by a given power of
the variate.

Consider a positive unimodal continuous frequency function
$y = \psi(x)$ of a system of variates $x_i$, with
a range of -1 to +1, with $\psi(-1) = \psi(1) = 0$, with a single
mode at some point, say at $x = b$ $(-1 < b < 1)$, and with the
derivative $\psi'(x)$ continuous. More precisely, we assume
that $\psi(x)$ is positive except at the end points of the interval
-1 to +1, where it is zero, and that $\psi'(x)$ changes
from positive to negative at $b$, and is non-negative or
non-positive at any point $x = a$ according as $a$ is less or
greater than $b$.

¹Metron, Vol. 1, Part 4 (1921), pp. 3-32.

²Proceedings of the National Academy, Vol. 13, No. 12 (1927), pp. 817-820.

It is the main object of the present paper to consider certain
properties of the distribution of variates $u_i = (e x_i + f)/(g x_i + h)$
obtained by a linear fractional transformation of
the x's, where e, f, g, and h are real numbers so selected
that $u = (ex + f)/(gx + h)$ is continuous from
$x = -1$ to $x = 1$.

When g =0, we have the case of the linear transformation 
which simply has an effect equivalent to a change of origin and 
of unit of measurement. As we are not in the present problem 
much interested in such a simple transformation, we shall, in 
general, assume $g \neq 0$. Moreover, we take g positive, since
this involves no loss of generality. 

We shall, except as otherwise stated, restrict our considerations
to the interval for u that corresponds to $-1 \le x \le 1$,
and to such transformations that the derivative of u with respect
to x is finite for each value of x and that u increases
when x increases. These restrictions require that

$\dfrac{du}{dx} = \dfrac{he - fg}{(gx + h)^2}$,

where $g < |h|$ and where the determinant

(1)  $he - fg > 0$.



Starting then with

(2)  $u = \dfrac{ex + f}{gx + h}$,

we have

(3)  $x = \dfrac{f - hu}{gu - e}$.

Next, let

(4)  $v = \phi(u)$

be the frequency function of the new variates u. Then we
may write¹

(5)  $v = \psi\!\left(\dfrac{f - hu}{gu - e}\right) \dfrac{he - fg}{(gu - e)^2}$.

Since $he - fg > 0$, we know that v is positive throughout
the interval in which we are interested except that $v = 0$
at the end points. From (5) it seems that the new distribution
function may possibly become infinite when $u = e/g$, but the
question then arises as to whether $e/g$ is an admissible value
of u.

We shall prove that $e/g$ is not an admissible value of u
by showing that u cannot take the value $e/g$ within the
interval $u = (f - e)/(h - g)$ to $u = (e + f)/(g + h)$
wherein u lies when $-1 \le x \le 1$. In this connection we shall
also establish some inequalities that will be found useful in the
consideration of certain properties of the new distribution. Consider
first the cases in which $g + h$ is positive.

Then since $eh > fg$, we have $eh + eg > fg + eg$.
Divide by $g(g + h)$, and we have $\dfrac{e}{g} > \dfrac{e + f}{g + h}$. Hence,
$e/g$ is too large when $g + h$ is positive to be an admissible
value of u.

¹cf. Annals of Mathematics, Vol. 23, No. 4 (1922), pp. 293-4.

Consider next the cases in which $g + h$ is negative. In
this case, $h < 0$ since $g > 0$. Hence $g - h > 0$. Then since
$eh > fg$, we have $eh - eg > fg - eg$. Divide by
the positive number $g(g - h)$. This gives $-\dfrac{e}{g} > \dfrac{f - e}{g - h}$
and $\dfrac{e}{g} < \dfrac{f - e}{h - g}$.

Hence, when $g(g + h) < 0$, $e/g$ is too small to be an
admissible value of u.

To summarize with $he - fg > 0$, we have shown that:

(a) When $g + h$ is positive, $e/g$ is too large to be
an admissible value of u.

(b) When $g + h$ is negative, $e/g$ is too small to be
an admissible value of u.
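
Propositions (a) and (b) can be checked on particular transformations. The sketch below computes the end points $(f - e)/(h - g)$ and $(e + f)/(g + h)$ of the interval of u and compares them with $e/g$; the function names and the coefficient sets are hypothetical, and integer coefficients are assumed so that exact rational arithmetic can be used.

```python
from fractions import Fraction

def u_interval(e, f, g, h):
    """End points of u = (e*x + f)/(g*x + h) for -1 <= x <= 1, under the
    restrictions of the text: g > 0, g < |h| and h*e - f*g > 0."""
    assert g > 0 and g < abs(h) and h * e - f * g > 0
    return Fraction(f - e, h - g), Fraction(e + f, g + h)

def classify(e, f, g, h):
    """Report on which side of the admissible interval e/g falls."""
    lower, upper = u_interval(e, f, g, h)
    pole = Fraction(e, g)
    if pole > upper:
        return "too large"
    if pole < lower:
        return "too small"
    return "admissible"

print(classify(1, 0, 1, 2))    # a case with g + h > 0
print(classify(-1, 0, 1, -2))  # a case with g + h < 0
```

In neither case does the value $e/g$, at which the factor $1/(gu - e)^2$ would become infinite, fall within the range of u.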

Returning now to the consideration of our frequency function
$v = \psi\!\left(\dfrac{f - hu}{gu - e}\right) \dfrac{he - fg}{(gu - e)^2}$ in (5), we obtain

(6)  $\dfrac{dv}{du} = \dfrac{(he - fg)^2}{(gu - e)^4}\, \psi'\!\left(\dfrac{f - hu}{gu - e}\right) - \dfrac{2g(he - fg)}{(gu - e)^3}\, \psi\!\left(\dfrac{f - hu}{gu - e}\right)$.

When u takes the value $(eb + f)/(gb + h)$ into which
variates at the mode are transformed, we know that
$\psi'\!\left(\dfrac{f - hu}{gu - e}\right) = \psi'(b) = 0$.

By making use of the fact that $he - fg > 0$, and the propositions
(a) and (b) relating to the inadmissibility of $e/g$ as
a value of u in an examination of the right hand member of
(6) for $u = (eb + f)/(gb + h)$, we establish the
following proposition in regard to the sign of the derivative
$dv/du$ for the value of u which corresponds to the
modal value of x:

When $g + h \neq 0$, $dv/du$ is positive or negative
at $u = (eb + f)/(gb + h)$ according as $g + h$ is
positive or negative.

The truth of this proposition follows readily by applying
(a) and (b) to (6), remembering that g is positive and that
$\psi'(b)$ vanishes.

We shall show next in case $g + h > 0$, that $dv/du$
is non-negative for all admissible values of u less than
$(eb + f)/(gb + h)$. To see this from (6), note first
that $\psi'[(f - hu)/(gu - e)]$ remains non-negative for
$(f - hu)/(gu - e) < b$ or for u less than
$(eb + f)/(gb + h)$, and note second that $g/(gu - e)^3$
is negative since $e/g$ is too large to be an admissible value
of u under the condition $g + h > 0$.

Next, in case $g + h < 0$, $dv/du$ is non-positive for
all values of $u > (eb + f)/(gb + h)$. To see this from
(6), note first that $\psi'[(f - hu)/(gu - e)]$ remains
non-positive for $(f - hu)/(gu - e) > b$ or for $u > (eb + f)/(gb + h)$,
and note second that $g/(gu - e)^3$ is positive when $g + h < 0$
because in this case $u > e/g$.

To summarize, when $g + h \neq 0$, we state the

Theorem I. When the derivative $dv/du$ is
positive for the value of u into which variates at the
modal value $x = b$ transform, then $dv/du$ is
non-negative for all smaller values of u. Similarly,
when $dv/du$ is negative for the value of u into
which variates at the modal value $x = b$ transform,
then $dv/du$ is non-positive for all larger values
of u.

Finally, we wish to inquire about a modal value for
the frequency function $v = \phi(u)$ in (5). To this end,
consider first the case in which $dv/du$ is positive at
$u = (eb + f)/(gb + h)$. At a point between
$u = (eb + f)/(gb + h)$ and the upper bound of u, that
is $(e + f)/(g + h)$, a maximum value of v occurs. To
see this, note when $u = (e + f)/(g + h)$ that
$dv/du = \psi'(1)\,(g + h)^4/(he - fg)^2$, which is
negative or zero, since $\psi'(1)$ is negative or zero. If it is negative,
there is a maximum where the sign of the continuous first
derivative changes from positive to negative. If $dv/du$ is
zero at $u = (e + f)/(g + h)$, it follows also that there
is at least one maximum of $\phi(u)$ between $u = (eb + f)/(gb + h)$
and $u = (e + f)/(g + h)$ since $v = 0$ at $u = (e + f)/(g + h)$
and v must have changed from an increasing positive function
at $u = (eb + f)/(gb + h)$ to a decreasing function before
becoming zero at $u = (e + f)/(g + h)$. Similarly, it may
be shown that there is a mode at a value of $u < (eb + f)/(gb + h)$
whenever $dv/du$ is negative at $u = (eb + f)/(gb + h)$.
We may then state the following:

Theorem II. Given a unimodal continuous positive function y = ψ(x) of variates x, with a range from -1 to +1, with a mode at x = b, with ψ(-1) = ψ(1) = 0, and with the derivative ψ'(x) continuous from x = -1 to x = 1, then the frequency distribution v = φ(u) of variates u = (ex+f)/(gx+h) (g > 0) has a mode at a value of u > (eb+f)/(gb+h) when g+h > 0. It has a mode at a value of u < (eb+f)/(gb+h) when g+h < 0.

Since we have so restricted our transformation that the order of corresponding values is preserved, the transformation carries the median of the distribution of x's into the median of the distribution of u's, and we may state the following:

Corollary. If y = ψ(x) has its median and mode coincident at x = b, the frequency distribution v = φ(u) of u = (ex+f)/(gx+h) has a modal value greater or less than its median according as g+h is greater or less than zero.
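Theorem II can be checked numerically in a particular case. The following is only an illustrative sketch: it assumes the density ψ(x) = 3(1 - x²)/4 (mode at b = 0), two arbitrarily chosen sets of constants satisfying g > 0, he - fg > 0 and g < |h|, and the usual change of variable v = ψ(x)·dx/du with du/dx = (he - fg)/(gx + h)². The function names are hypothetical.

```python
def psi(x):
    """Assumed example density on [-1, 1] with its mode at b = 0."""
    return 0.75 * (1.0 - x * x)


def mode_of_phi(e, f, g, h, steps=20000):
    """Grid-search the mode of v = phi(u), where u = (e*x + f) / (g*x + h)."""
    assert g > 0 and h * e - f * g > 0 and g < abs(h)
    best_x, best_v = None, -1.0
    for i in range(1, steps):
        x = -1.0 + 2.0 * i / steps
        # v = psi(x) * dx/du, with du/dx = (he - fg) / (gx + h)^2
        v = psi(x) * (g * x + h) ** 2 / (h * e - f * g)
        if v > best_v:
            best_x, best_v = x, v
    return (e * best_x + f) / (g * best_x + h)


def image_of_mode(e, f, g, h, b=0.0):
    """Value of u into which the modal value x = b transforms."""
    return (e * b + f) / (g * b + h)


u_plus = mode_of_phi(2.0, 0.0, 1.0, 3.0)      # a case with g + h > 0
u_minus = mode_of_phi(-2.0, 0.0, 1.0, -3.0)   # a case with g + h < 0
print(u_plus, image_of_mode(2.0, 0.0, 1.0, 3.0))
print(u_minus, image_of_mode(-2.0, 0.0, 1.0, -3.0))
```

In the first case the mode of φ(u) lies above the image of b, and in the second it lies below, in agreement with the theorem for these particular constants.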




Thus far we have imposed the condition g < |h|. Let us next consider the cases in which h = -g and h = g instead of requiring that g < |h|. Consider first the case h = -g. In this case

$$u = \frac{ex+f}{g(x-1)} \tag{7}$$

and

$$\frac{du}{dx} = \frac{he-fg}{(gx+h)^2} = -\frac{e+f}{g(x-1)^2}. \tag{8}$$

Both u and du/dx become infinite as x approaches 1. Suppose e and f so chosen that u is an increasing function of x for the interval -1 ≤ x < 1, then u in (7) is an increasing function of x for the larger interval -∞ < x < 1 and it follows, for the case h = -g, that e/g is too small to be an admissible value of u when -1 ≤ x < 1, since it is the value of u when x = -∞.

For the case h = g, we have

$$u = \frac{ex+f}{g(x+1)} \tag{9}$$

and

$$\frac{du}{dx} = \frac{he-fg}{(gx+h)^2} = \frac{e-f}{g(x+1)^2}. \tag{10}$$

Since u in (9) is an increasing continuous function of x for the interval -1 < x < ∞ wherever e and f are so selected that it is increasing for the sub-interval -1 < x ≤ 1, it follows, for h = g, that e/g, the value of u when x = ∞, is too large to be an admissible value of u when -1 < x ≤ 1. By making use of the fact that e/g is too small or too large to be an admissible value of u according as h = -g or h = g, we readily obtain the following results from an examination of (6): The derivative dv/du given in (6) is positive at the point u = (eb+f)/(gb+h) when h = g, and it is negative at this point when h = -g.

Moreover it readily follows as in the case where g < |h| that when the derivative dv/du is positive for the value of u into which the modal value x = b transforms, then dv/du is non-negative for all smaller values of u, and when dv/du is negative for the value of u into which the modal value b transforms, it is non-positive for all larger values of u.

Next, for the case h = g, a mode occurs for a value of u > (eb+f)/(gb+h). This may be seen by noting that as x approaches 1 and as u takes corresponding values dv/du in (6) approaches the value 16g²ψ'(1)/(e-f)², which is negative or zero. The analysis given above for the corresponding case g < |h| may be applied, with the conclusions stated in Theorem II by replacing g+h > 0 by h = g and g+h < 0 by h = -g.

The question very naturally arises as to whether there exists a linear fractional transformation u = (ex+f)/(gx+h) that will transform almost any distribution with the properties of y = ψ(x) into a new distribution v = φ(u) with a mode at a previously assigned point u = c within the range of admissible values of u. To insure a mode for v = φ(u) at u = c, it is, of course, sufficient that there exist values of e, f, g, and h that make the continuous function

$$\frac{dv}{du} = \psi'(x)\,\frac{(he-fg)^2}{(gu-e)^4} - 2g\,\psi(x)\,\frac{he-fg}{(gu-e)^3}, \qquad x = \frac{f-hu}{gu-e}, \tag{11}$$

change sign from positive to negative at u = c.

Since the only restrictions on e, f, g, and h are that they shall be real, and that g and he - fg shall be positive, it seems that the requirement that dv/du shall change from positive to negative at an assigned value of u could probably be satisfied for some important classes of relatively simple functions. As a simple example, take the quadratic function ψ(x) = A + Bx + Cx², which, when subjected to the conditions on ψ(x), becomes ψ(x) = 3(1-x²)/4.

The mode is in this case at 0. The problem we propose is to find the linear fractional transformation u = (ex+f)/(gx+h) that will transform ψ(x) into φ(u) with a mode at an assigned u = c. In this case (11) becomes

$$\frac{dv}{du} = \frac{3}{2}\,\frac{he-fg}{(e-gu)^5}\Big[(he-fg)(f-hu) + g\big\{(g^2-h^2)u^2 + 2u(fh-eg) + e^2 - f^2\big\}\Big]. \tag{12}$$


To facilitate the examination of (12), make h = g. Then (12) reduces to

$$\frac{dv}{du} = -\frac{3g^2(e-f)^2}{2(gu-e)^5}\,(e+2f-3gu). \tag{13}$$

Since g+h > 0, we have gu-e < 0, and consequently the coefficient of (e+2f-3gu) is positive. To provide for the change of sign of (13) at u = c, select e, f, and g so that e+2f = 3cg. To make (13) positive at u = c-δ and negative at u = c+δ, where δ is arbitrarily small and positive, we may assign to g any positive value and to e any value greater than cg, for then f is less than e, which is the condition he - fg > 0 when h = g. While there are thus an infinite number of ways in which we may select a linear fractional transformation so that, when applied to special functions, it will give a new distribution with a mode at an assigned point, no general proposition is proved that assures an assigned modal value of φ(u).
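The selection rule just described (h = g, e greater than cg, and e + 2f = 3cg) can be tried numerically. The sketch below is illustrative only: the function names and the particular choice e = cg + 1 are assumptions, and the mode is located by a simple grid search over x using v = ψ(x)·(gx + h)²/(he - fg) for ψ(x) = 3(1 - x²)/4.

```python
def transformation_with_mode(c, g=1.0, e=None):
    """Choose e, f, h following the text: h = g, e > c*g, e + 2f = 3*c*g."""
    if e is None:
        e = c * g + 1.0            # arbitrary value greater than c*g
    f = (3.0 * c * g - e) / 2.0
    h = g
    return e, f, g, h


def mode_of_phi(e, f, g, h, steps=20000):
    """Grid-search the mode of phi(u) for psi(x) = 3(1 - x^2)/4."""
    best_x, best_v = None, -1.0
    for i in range(1, steps):
        x = -1.0 + 2.0 * i / steps
        v = 0.75 * (1.0 - x * x) * (g * x + h) ** 2 / (h * e - f * g)
        if v > best_v:
            best_x, best_v = x, v
    return (e * best_x + f) / (g * best_x + h)


e, f, g, h = transformation_with_mode(c=2.0)
mode_at_2 = mode_of_phi(e, f, g, h)
print(e, f, g, h, mode_at_2)

e2, f2, g2, h2 = transformation_with_mode(c=-1.0)
mode_at_minus_1 = mode_of_phi(e2, f2, g2, h2)
print(e2, f2, g2, h2, mode_at_minus_1)
```

For both assigned values of c the located mode agrees with c to within the grid resolution, and f comes out less than e as the text requires.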




ON SMALL SAMPLES FROM CERTAIN 
NON-NORMAL UNIVERSES* 


By 

Paul R. Rider
Washington University


INTRODUCTION 


The distribution of the ratio

z = (mean of sample - mean of universe) / (standard deviation of sample),

which is of great importance in the theory of small samples, has been derived exactly by theoretical methods for samples of any size from a normal universe.¹ Experimental studies² have been

*The writer desires to express his grateful appreciation to the National Research Council, which made possible this study by a grant-in-aid for the assistance of a computer.

¹ See, for example, R. A. Fisher, Applications of “Student’s” Distribution, Metron, Vol. 5, No. 3 (Dec. 1, 1925), pp. 90-104.

² e. g. W. A. Shewhart and F. W. Winters, Small Samples: New Experimental Results, Journal of the American Statistical Association, Vol. 23 (1928), pp. 144-53;

J. Neyman and E. S. Pearson, On the Use and Interpretation of Certain Test Criteria for Purposes of Statistical Inference, Part I, Biometrika, Vol. 20A (1928), pp. 175-240;

“Sophister,” Discussion of Small Samples Drawn from an Infinite Skew Population, Biometrika, Vol. 20A (1928), pp. 389-423;

E. S. Pearson assisted by N. K. Adyanthaya and others, The Distribution of Frequency Constants in Small Samples from Non-normal Symmetrical and Skew Populations, 2nd paper, Biometrika, Vol. 21 (1929), pp. 259-86.





made of the z-distribution for samples of specific sizes from other types of universe. A theoretical method applicable to samples from a discrete universe was used in a previous paper,¹ in which a rectangular universe was studied in some detail. The rectangular universe was chosen as being the simplest from the standpoint of the method employed, and as a good example of a limited symmetric distribution. It is the purpose of the present paper to apply the method to a triangular population, which is a specimen of a limited skew distribution, and also to a U-shaped universe. The rectangular, triangular and U-shaped universes are shown in Table I in the columns headed R, T, and U, respectively. Their graphs are exhibited in Figure 1.

In addition to the z -distribution, the distributions of means 
from the triangular and from the U-shaped universe are given. 

In the concluding section is discussed the probability corresponding to an interval of three sample standard deviations on each side of the sample mean.

All of the results of the paper are for samples of four. 
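As a small illustration of the ratio z for one sample of four, the sketch below computes it for the sample (1, 1, 2, 2) drawn from the rectangular universe (mean 4.5). The function name is hypothetical, and the standard deviation is taken with divisor n, which is the convention that reproduces s = 0.5 for this sample in the concluding section.

```python
from math import sqrt


def z_ratio(sample, universe_mean):
    """(mean of sample - mean of universe) / (standard deviation of sample)."""
    n = len(sample)
    mean = sum(sample) / n
    # Standard deviation with divisor n; a sample of identical values gives
    # s = 0 and the ratio is then undefined.
    s = sqrt(sum((x - mean) ** 2 for x in sample) / n)
    return (mean - universe_mean) / s


z = z_ratio([1, 1, 2, 2], 4.5)
print(z)  # -6.0
```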


THE DISTRIBUTION OF Z 

The distributions of z are shown in Table II,² in which the distribution for samples from a normal universe, N, is also given.

The cumulated probability of z for the triangular and for the U-shaped universe are shown in Table III, which may be compared with a similar table for a rectangular and for a normal universe given in Biometrika, Vol. 21 (1929), p. 131.

¹ P. R. Rider, On the Distribution of the Ratio of Mean to Standard Deviation in Small Samples from Non-normal Universes, Biometrika, Vol. 21 (1929), pp. 124-143.

² For an explanation of the method of deriving these distributions see Rider, loc. cit.





These cumulated probabilities are plotted on probability paper in Figures 2 and 3 and may be compared with similar probabilities for a rectangular universe by reference to Biometrika, Vol. 21 (1929), p. 129, Figure 2.

The principal results to be noted are as follows: 

1. The general characteristics of the z-distribution for the U-shaped universe are the same as those for a rectangular universe, viz. a greater number of z's outside of a certain value of |z|, and also a greater clustering of z's about the origin, than is the case for a normal universe.¹ This is to be expected, since the values of β₂ for U and R are 1.132 and 1.776 respectively, as compared with the value 3 for N.

2. The negative skewness in the triangular universe produces skewness of the opposite type in the distribution of z, as found experimentally by Neyman and E. S. Pearson² and by “Sophister.”³ This means (in the case of negative skewness in the universe) that the probability corresponding to an interval from -∞ to z is smaller than when the sampling is from a normal universe.

3. The cumulated probability of |z|, or the probability corresponding to an interval from -z to z, is somewhat the same for the triangular universe as for a normal universe;⁴ a comparison is made in Table IV.

Results 2 and 3 are apparently due to the fact that in a 


¹ See Rider, loc. cit., p. 130.

² Biometrika, Vol. 20A (1928), p. 198.

³ Biometrika, Vol. 20A (1928), p. 408.

⁴ cf. E. S. Pearson assisted by N. K. Adyanthaya and others, The Distribution of Frequency Constants in Small Samples from Non-normal Symmetrical and Skew Populations, 2nd paper, Biometrika, Vol. 21 (1929), pp. 259-86.





skew universe the regression of variance on mean¹ is often essentially linear (if parabolic, the vertex of the parabola is well to one side of the scatter diagram). Let us consider the case in which the slope of the regression line is positive. Designating by x the difference between the mean of a sample and the mean of the universe, and by s the standard deviation of the sample, we see that large values of |x| tend to be associated with large values of s² (and therefore with large values of s). Thus the values of z tend to be smaller. On the other hand, for large values of |-x|, s is smaller and |-z| consequently larger. This means that the frequencies corresponding to the algebraically lower values of z are greater than in the case of a normal universe, or that the use of “Student's” tables would give results too small for the probability that the mean of a sample does not exceed algebraically the mean of the universe by more than z times the standard deviation of the sample. The opposite is true in the case studied here, since the universe is negatively skew and the regression line of s² on x would have a negative slope.

Since there is a shifting of the whole cumulated z-distribution to the right or left, the effect noted in 3 is readily explained. As a result of this effect we should apparently not be far wrong, when sampling from a skew universe, if we used “Student's” tables to obtain the probability that the mean of a sample does not exceed numerically the mean of the universe by more than z times the standard deviation of the sample.²


¹ For the regression formula see J. Neyman, On the Correlation of the Mean and the Variance in Samples from an “Infinite” Population, Biometrika, Vol. 18 (1926), pp. 401-13.

² See E. S. Pearson assisted by N. K. Adyanthaya and others, The Distribution of Frequency Constants in Small Samples from Non-normal Symmetrical and Skew Populations, 2nd paper, Biometrika, Vol. 21 (1929), pp. 259-86.





THE DISTRIBUTION OF MEANS OF SAMPLES 

The distributions of means of samples are shown in Tables V and VI. In these tables x indicates the difference between the mean of the sample and the mean of the universe.

For the difficulties involved in obtaining satisfactory results for the distribution of means of small samples from a U-shaped universe see K. J. Holzinger and A. E. R. Church, “On the Means of Samples from a U-shaped Population,” Biometrika, Vol. 20A (1928), pp. 361-88.

The probability corresponding to an interval of three sample standard deviations on each side of the sample mean.

If M is the mean and σ the standard deviation of a normally distributed variate x, then, as is well known, the probability that an item selected at random will lie within the range M ± 3σ is 0.997. If x̄ and s are the mean and the standard deviation respectively of a sample, the expected or average probability corresponding to the interval x̄ ± 3s will be different from the probability corresponding to the interval M ± 3σ. Shewhart¹ obtained experimentally for the average probability for samples of four associated with the interval x̄ ± 3s the values 0.90 for a normal universe, 0.91 for a rectangular universe, and 0.91 for a triangular universe.

By analyzing all possible samples of four from the rectangular and triangular universes of Table I it was possible to obtain the probability corresponding to an interval of 3s on either side of the sample mean. For example let us consider the sample (1, 1, 2, 2), for which x̄ = 1.5, s = 0.5. The interval x̄ ± 3s extends from 0 to 3. This interval includes 0.4 of the rectangular universe R; 0.4 then is the probability that an

¹ W. A. Shewhart, Note on the Probability Associated with the Error of a Single Observation, Journal of Forestry, Vol. 26 (1928), pp. 601-607.





TABLE I

Rectangular, Triangular and U-Shaped Universes

              FREQUENCY
   x        R        T        U
   0        1                10
   1        1        1        5
   2        1        2        1
   3        1        3        1
   4        1        4        1
   5        1        5        1
   6        1        6        1
   7        1        7        1
   8        1        8        5
   9        1        9       10
  10                10
 Total     10       55       36
 Mean       4.5      7        4.5
 β₁*        0        0.326̇    0
 β₂*        1.77̇5̇    2.36̇     1.132+

*The values of the β's are uncorrected for grouping. The dots over the digits indicate repeating decimals. The values for a continuous rectangular distribution are β₁ = 0, β₂ = 1.8, and for a continuous triangular distribution are β₁ = 0.32, β₂ = 2.4.
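The mean, β₁ and β₂ entries for the rectangular and triangular columns can be recomputed from the frequencies alone. The sketch below is an illustration under the usual definitions β₁ = μ₃²/μ₂³ and β₂ = μ₄/μ₂² (moments about the mean, no grouping correction); the function name is hypothetical.

```python
def betas(freq):
    """Return (mean, beta1, beta2) for a universe given as {value: frequency}."""
    total = sum(freq.values())
    mean = sum(x * w for x, w in freq.items()) / total

    def mu(k):
        # k-th moment about the mean
        return sum(w * (x - mean) ** k for x, w in freq.items()) / total

    m2, m3, m4 = mu(2), mu(3), mu(4)
    return mean, m3 ** 2 / m2 ** 3, m4 / m2 ** 2


R = {x: 1 for x in range(0, 10)}    # rectangular universe: values 0..9
T = {x: x for x in range(1, 11)}    # triangular universe: value x has frequency x

mean_R, b1_R, b2_R = betas(R)
mean_T, b1_T, b2_T = betas(T)
print(mean_R, b1_R, round(b2_R, 4))
print(mean_T, round(b1_T, 4), round(b2_T, 4))
```

The repeating decimals in the table correspond to 1.7757..., 0.3266... and 2.3666... in this computation.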



observed value will fall within the interval. Now the particular sample (1, 1, 2, 2) would occur 6 times out of 10,000. If we take all of the samples for which the interval x̄ ± 3s includes 0.4 of the rectangular universe we find that such samples occur 106 times out of 10,000. Such an analysis leads to Table VII, from which it is ascertained that the average probability corresponding to an interval of x̄ ± 3s is 0.920. A similar analysis of the triangular universe T gives us Table VIII and yields 0.907 as the average probability associated with x̄ ± 3s. A better understanding of the situation may be obtained from Figure 4.
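The analysis of all possible samples of four can be sketched as follows. This is an illustration rather than the author's procedure: the function names are hypothetical, s is taken with divisor n, the end points of x̄ ± 3s are counted as inside the interval (as in the example, where 0 and 3 are included), and the comparison is done in integers to avoid rounding at the end points.

```python
from fractions import Fraction
from itertools import product

UNIVERSE_R = {x: 1 for x in range(10)}    # rectangular universe of Table I


def coverage(sample, universe):
    """Fraction of the universe lying within mean ± 3s of the sample."""
    n = len(sample)
    S = sum(sample)
    Q = sum(x * x for x in sample)
    # |x - mean| <= 3s  <=>  (n*x - S)^2 <= 9*(n*Q - S^2),
    # because s^2 = Q/n - (S/n)^2.
    bound = 9 * (n * Q - S * S)
    inside = sum(w for x, w in universe.items() if (n * x - S) ** 2 <= bound)
    return Fraction(inside, sum(universe.values()))


def average_coverage(universe, n=4):
    """Average of coverage() over all ordered samples, weighted by frequency."""
    total, weight_sum = Fraction(0), 0
    for sample in product(list(universe), repeat=n):
        w = 1
        for x in sample:
            w *= universe[x]
        total += w * coverage(sample, universe)
        weight_sum += w
    return total / weight_sum


example = coverage((1, 1, 2, 2), UNIVERSE_R)
orderings = sum(1 for s in product(range(10), repeat=4) if sorted(s) == [1, 1, 2, 2])
average_R = average_coverage(UNIVERSE_R)
print(example, orderings, float(average_R))
```

The first two printed values correspond to the 0.4 and the 6 times out of 10,000 quoted for the sample (1, 1, 2, 2); the third can be compared with the average probability reported from Table VII.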



TABLE II

Probability of z for Samples of 4

       z                N        R        T        U
 Below -4.25          .0026    .0077    .0015+   .0384
 -4.25 to -3.75       .0011    .0022
 -3.75 to -3.25       .0018    .0026             .0009
 -3.25 to -2.75       .0032    .0032
 -2.75 to -2.25       .0062    .0074
 -2.25 to -1.75       .0131    .0188    .0061    .0106
 -1.75 to -1.25       .0314    .0267    .0251    .0147
 -1.25 to -0.75       .0829    .0692             .0256
 -0.75 to -0.25       .2047    .2000             .2299
 -0.25 to  0.25       .3058    .3244    .3249    .3405+
  0.25 to  0.75       .2047    .2000    .1741    .2299
  0.75 to  1.25       .0829    .0692    .0764    .0256
  1.25 to  1.75       .0314    .0267    .0566    .0147
  1.75 to  2.25       .0131    .0188             .0106
  2.25 to  2.75       .0062    .0074             .0016
  2.75 to  3.25       .0032    .0032             .0077
  3.25 to  3.75       .0018    .0026             .0009
  3.75 to  4.25       .0011    .0022             .0004
 Above  4.25          .0026    .0077             .0383















TABLE III

The cumulated probability of z, or probability that the mean of a random sample of 4 will not exceed (in algebraic sense) the mean of the universe by more than z times the standard deviation of the sample.

          Cumulated Probability      Cumulated Probability
           Triangular Universe         U-Shaped Universe
   z       for -z       for z         for -z       for z
  0.0     .51955-     .51955-       .54355+     .54355+
   .1     .41649      .54037        .39365-     .60635+
   .2     .34497      .61053        .34651      .65349
   .3     .28885+     .65136        .30739      .69261
   .4     .22719      .70010        .27831      .72193
   .5     .18568      .74269        .22081      .77991
   .6     .14350-     .76942        .14785+     .85215-
   .7     .11580      .79993        .11382      .88618
   .8     .09485-     .81086        .09844      .90192
   .9     .07784      .83462        .09065+     .90935+
  1.0     .06130      .86748        .08285-     .91715+
  1.1     .05053      .87456                    .92006
  1.2     .04256      .88731        .07471      .92529
  1.3     .03716      .88731        .07363      .92637
  1.4     .03152      .90787        .07179      .92821
  1.5     .02783      .91316        .06614      .93387
  1.6     .02334      .91911        .05979      .94021
  1.7     .01845-     .93480        .05975-     .94025-
  1.8     .01552      .94390        .05941      .94059
  1.9     .01410      .94390        .05798      .94202
  2.0     .01366      .94810        .05441      .94774
  2.1     .01265-     .94810        .04959      .95041
  2.2     .01039      .95565-       .04892      .95108
  2.3     .00907      .95565-       .04892      .95108
  2.4     .00871      .95565-       .04891      .95109
  2.5     .00816      .95565-       .04891      .95118
  2.6     .00725+     .95565-       .04803      .95197
  2.7     .00725+     .95565-       .04732      .95268
  2.8     .00661      .96509        .04728      .95272
  2.9     .00483      .97910        .04133      .95867
  3.0     .00462      .98250-       .03954      .96046
  3.5     .00272      .98250-       .03904      .96132
  4.0     .00242      .98250-       .03833      .96168



















TABLE IV

Cumulated Probability of |z| for Samples of 4.

   |z|          Probability            |z|          Probability
 greater    Triangular   Normal      greater    Triangular   Normal
  than       Universe   Universe      than       Universe   Universe
   0.0        .9219     1.0000         1.6
    .1        .8761      .8735+        1.7        .0836
    .2        .7303      .7519         1.8        .0716
    .3        .6375-     .6392         1.9        .0702      .0460
    .4        .5271      .5382         2.0        .0652      .0405+
    .5        .4423      .4502         2.1        .0646      .0358
    .6        .3723      .3751         2.2        .0547      .0318
    .7        .3135-     .3121         2.3        .0534      .0283
    .8        .2834      .2599         2.4        .0531      .0253
    .9        .2432      .2169         2.5        .0525+     .0227
   1.0        .1891      .1817         2.6        .0516      .0204
   1.1        .1755-     .1528         2.7        .0516      .0185-
   1.2        .1552      .1292         2.8        .0415+     .0167
   1.3        .1497      .1098         2.9        .0257      .0152
   1.4        .1236      .0938         3.0        .0212      .0138
   1.5        .1146      .0805+
















TABLE V

Distribution of Means of Samples of 4 from Triangular Universe

    x     Probability       x     Probability       x     Probability
  -5.25     .00001        -2.25     .01627         0.75     .07202
  -5.00     .00004        -2.00     .02200         1.00     .06437
  -4.75     .00009        -1.75     .02882         1.25     .05496
  -4.50     .00019        -1.50     .03559         1.50     .04462
  -4.25     .00038        -1.25     .04501         1.75     .03415+
  -4.00     .00070        -1.00     .05362         2.00     .02430
  -3.75     .00125        -0.75     .06187         2.25     .01569
  -3.50     .00212        -0.50     .06916         2.50     .00881
  -3.25     .00344        -0.25     .07484         2.75     .00393
  -3.00     .00537         0.00     .07834         3.00     .00109
  -2.75     .00805-        0.25     .07918
  -2.50     .01165-        0.50     .07707

x = (mean of sample) - (mean of universe)
















TABLE VI

Distribution of Means of Samples of 4 from U-Shaped Universe

    x      Frequency   Probability        x      Frequency   Probability
  -4.50      10000       .0060           0.25     106660       .0635+
  -4.25      20000       .0119           0.50      62755       .0374
  -4.00      19000       .0113           0.75      51244       .0305+
  -3.75      15000       .0089           1.00      49270       .0293
  -3.50      14225       .0085-          1.25      48376       .0288
  -3.25      15300       .0091           1.50      49505       .0295-
  -3.00      16690       .0099           1.75      63960       .0381
  -2.75      18140       .0108           2.00      89660       .0534
  -2.50      35651       .0212           2.25      81224       .0484
  -2.25      81224       .0484           2.50      35651       .0212
  -2.00      89660       .0534           2.75      18140       .0108
  -1.75      63960       .0381           3.00      16690       .0099
  -1.50      49505       .0295-          3.25      15300       .0091
  -1.25      48376       .0288           3.50      14225       .0085-
  -1.00      49270       .0293           3.75      15000       .0089
  -0.75      51244       .0305+          4.00      19000       .0113
  -0.50      62755       .0374           4.25      20000       .0119
  -0.25     106660       .0635+          4.50      10000       .0060
   0.00     146296       .0871
                                        Total    1679616      1.0001

x = (mean of sample) - (mean of universe)
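The frequencies of Table VI can be regenerated by counting, for every possible sum of four values, the number of ways it arises when each value is weighted by its frequency in the U-shaped universe of Table I. The sketch below is one way to do this (a repeated convolution); the function name is hypothetical and nothing here is claimed to be the method used in the paper.

```python
from collections import Counter

# U-shaped universe of Table I: value -> frequency
U = {0: 10, 1: 5, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 5, 9: 10}


def sum_distribution(universe, n):
    """Weighted count of ordered samples of size n for each possible sum."""
    dist = Counter({0: 1})
    for _ in range(n):
        nxt = Counter()
        for total, ways in dist.items():
            for x, w in universe.items():
                nxt[total + x] += ways * w
        dist = nxt
    return dist


dist = sum_distribution(U, 4)
universe_mean = 4.5
grand_total = sum(dist.values())
for total in sorted(dist)[:5]:
    # x = (mean of sample) - (mean of universe), then frequency and probability
    print(total / 4 - universe_mean, dist[total], round(dist[total] / grand_total, 4))
```

The grand total is 36⁴ = 1679616, the number of equally weighted ordered samples of four, which is the total shown at the foot of the table.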




















TABLE VII

Probability Corresponding to the Interval x̄ ± 3s
Rectangular Universe

 Proportion of universe        Number of samples for which
 included in x̄ ± 3s*           this proportion occurs**
         0.1                             10
         0.2                              8
         0.3                             84
         0.4                            106
         0.5                            284
         0.6                            324
         0.7                            564
         0.8                            652
         0.9                            888
         1.0                           7080
        Total                         10000

* i. e. the probability corresponding to x̄ ± 3s.

** The probability of the occurrence of this proportion is, of course, obtained by dividing by 10000.
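The average probability quoted in the text for the rectangular universe follows from Table VII as a weighted mean of the proportions. A minimal sketch of that arithmetic, with the table entered by hand (the variable names are illustrative):

```python
# Table VII: proportion of universe included -> number of samples (out of 10000)
TABLE_VII = {0.1: 10, 0.2: 8, 0.3: 84, 0.4: 106, 0.5: 284,
             0.6: 324, 0.7: 564, 0.8: 652, 0.9: 888, 1.0: 7080}

total_samples = sum(TABLE_VII.values())
average_probability = sum(p * n for p, n in TABLE_VII.items()) / total_samples
print(total_samples, round(average_probability, 3))  # 10000 0.92
```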





TABLE VIII

Probability Corresponding to the Interval x̄ ± 3s
Triangular Universe





Cumulated 


  3/55 = .055-
  4/55 = .073
  5/55 = .091
  6/55 = .109
  7/55 = .127
  8/55 = .145-
  9/55 = .164
 10/55 = .182
 12/55 = .218
 13/55 = .236
 14/55 = .255-
 15/55 = .273
 18/55 = .327
 19/55 = .345+
 20/55 = .364
 21/55 = .382
 22/55 = .400
 24/55 = .436
 25/55 = .455-
 26/55 = .473
 27/55 = .491
 28/55 = .509
 30/55 = .545+
 33/55 = .600
 34/55 = .618
 35/55 = .636
 36/55 = .655-
 39/55 = .709
 40/55 = .727
 42/55 = .764
 44/55 = .800
 45/55 = .818
 49/55 = .891
 52/55 = .945+
 54/55 = .982
 55/55 = 1.000


2K.1 

4036 

6993 

ii383 

1280 

7776 

2928 

8762 

12768 

36000 

8640 

2650S 

5400 

32768 

21600 

10584 

U2764 

19698 

71526 

27116 

20C.184 

115128 

37892 

54092 

555924 

57838 

26416 

556520 

774320 

904676 

879564 



91.50625 


.0846 

.0989 

.0961 

.4872 


1.0000 


.5128 

1.0000 


i. e. the probability corresponding to x̄ ± 3s.


















FIGURE 1

Rectangular Universe, Triangular Universe, U-Shaped Universe (vertical axes: Frequency)



FIGURE 2

Cumulated Probability of z, Triangular Universe
The curve is for samples of 4 from a normal universe.
The dots are for samples of 4 from the universe T.



FIGURE 3 

Cumulated Probability of z, U-Shaped Universe
The curve is for samples of 4 from a normal universe. 
The dots are for samples of 4 from the universe U . 








FIGURE 4

Probability Corresponding to the Interval x̄ ± 3s












AN EMPIRICAL DETERMINATION OF THE
DISTRIBUTION OF MEANS, STANDARD
DEVIATIONS AND CORRELATION COEFFICIENTS
DRAWN FROM RECTANGULAR
POPULATIONS*


By

Hilda Frost Dunlap

Territorial Normal and Training School, Honolulu, Hawaii


Formulae for the standard errors of means, standard deviations and correlation coefficients have been derived on the assumption of a normal distribution in the sampled population. They are said to serve approximately even when the population varies considerably from the normal. This paper presents empirical evidence of their applicability in the case of means and standard deviations of samples of ten from a rectangular discontinuous population, and of correlation coefficients of samples of fifty-two from a rank distribution.

The data for the study of the distribution of means and 
standard deviations were secured by throwing ten dice 1600 times. 

The dice were cubes four-tenths of an inch along an edge and numbered on opposite faces 1-6, 2-5, 3-4. They were constructed of bone and formed a matched set.


*The writer is indebted to Jack W. Dunlap for reading the entire manuscript and for checking the mechanical computations.






These were thrown from a cup whose inside diameter was 1.75 inches and whose depth was 2.5 inches. The dice were shaken in a box and then cast upon an especially prepared flat topped table covered with eight thicknesses of an army blanket.

As a guard against any possible bias in the table, the dice were thrown alternately with the right and left hands. After each throw the number of aces, deuces, treys, fours, fives, and sixes were recorded, and the mean and standard deviation calculated. In this study each throw was taken as a sample of ten drawn from a population of 16,000.

The next step was to determine whether there was any systematic bias in the dice used. The a priori expectation for any particular face of the die is one-sixth, here one sixth of 16,000, or 2,666⅔. This is of the nature of a point binomial of the form (p + q)^N with a standard deviation equal to √(Npq).


TABLE I

Distribution of Observed and Theoretical Populations with a Test of the Difference of Their Standard Deviations

 Die Face    Observed Frequency    Expected Frequency    Difference
    1              2726                 2666⅔               59⅓
    2              2653                 2666⅔               13⅔
    3              2671                 2666⅔                4⅓
    4              2763                 2666⅔               96⅓
    5              2650                 2666⅔               16⅔
    6              2537                 2666⅔              129⅔

 σ = (16000 · 1/6 · 5/6)^½ = 47.1        s = 70.8
 s - σ = 23.7 ± 13.76
















Table I gives the observed and expected values of each face. The standard deviation of the differences was determined and compared with the standard deviation of the expected distribution and the probable error of this difference was found.

Small s is used here to denote a standard deviation of a sample, while σ represents the standard deviation of the theoretical or true population. The formula for the standard deviation of a difference is

$$\sigma_{a-b} = \sqrt{\sigma_a^2 + \sigma_b^2 - 2\,r_{ab}\,\sigma_a\sigma_b}$$

and in particular

$$\sigma_{s-\sigma} = \sqrt{\sigma_s^2 + \sigma_\sigma^2 - 2\,r_{s\sigma}\,\sigma_s\sigma_\sigma}.$$

The second term drops out here because it is the standard deviation of the true standard error and this is equal to zero. The third term drops out for the same reason. Table I shows that the difference between the obtained and expected standard deviations is 23.7 ± 13.76. As this is less than twice its probable error, it can be concluded that the difference is not significant and that there is no significant bias in the dice.
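The quantities entering this test can be recomputed from the observed face counts. The sketch below is illustrative; it takes the standard deviation of the six differences about the common expected frequency with divisor 6, which is an assumption that happens to reproduce the tabulated 70.8.

```python
from math import sqrt

# Observed frequency of each die face in the 16,000 recorded faces (Table I)
observed = {1: 2726, 2: 2653, 3: 2671, 4: 2763, 5: 2650, 6: 2537}

N = sum(observed.values())
p = 1 / 6
expected = N * p                          # a priori expectation per face
sigma = sqrt(N * p * (1 - p))             # point-binomial standard deviation
differences = [count - expected for count in observed.values()]
s = sqrt(sum(d * d for d in differences) / len(differences))

print(N, round(expected, 2), round(sigma, 1), round(s, 1), round(s - sigma, 1))
# 16000 2666.67 47.1 70.8 23.7
```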


MEANS 

Figure 1 shows the distribution of the 1600 observed means; a normal curve for N = 1600 is superimposed on the histogram. For this distribution

γ₁ (= √β₁) = .0160 ± .0413, indicating symmetry
γ₂ (= β₂ - 3) = -.1050 ± .0826, indicating mesokurtosis

whence we may conclude that the normal curve represents this distribution adequately.

FIGURE 1

Distribution of 1600 means of samples of ten, with fitted normal curve.
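The two ± values attached to the skewness and kurtosis measures agree numerically with the normal-theory probable errors 0.6745·√(6/N) and 0.6745·√(24/N) for N = 1600. The text does not state which formulas were used, so the sketch below is only a consistency check under that assumption.

```python
from math import sqrt

N = 1600                    # number of sample means in the distribution
PROBABLE_ERROR = 0.6745     # probable error = 0.6745 x standard error

pe_skewness = PROBABLE_ERROR * sqrt(6 / N)     # assumed normal-theory formula
pe_kurtosis = PROBABLE_ERROR * sqrt(24 / N)    # assumed normal-theory formula
print(round(pe_skewness, 4), round(pe_kurtosis, 4))  # 0.0413 0.0826
```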

The curves of this and succeeding figures were drawn 
through points calculated at intervals of ^ % except that in 
the case of Figures 2 and 3, points beyond ± 2σ were calculated at intervals of 1σ.

The values of the observed means varied from 1.6 to 5.4, a 
range of 6.9129 standard deviations. 

The basic information to be drawn from this study of the distribution of 1600 means of samples of ten is given in Table II. The table is interpreted as follows:

The mean of the sampled population (16,000) is 3.47306, while the theoretical mean of the infinite population is 3.500000. The standard deviation of the sampled population (16,000) is 1.6988, and of the theoretical population 1.7078. The standard error of the mean of the sampled population is .0133. In comparing the mean of the sampled population with the mean of the theoretical infinite population, the former is treated as an experimental value whose standard error can be estimated, while the latter, being a true value, has no error.

The standard deviation of the difference between the means M_t (theoretical population) and M_s (sampled population) is

$$\sigma_{M_t - M_s} = \sqrt{\sigma_{M_t}^2 + \sigma_{M_s}^2 - 2\,r\,\sigma_{M_t}\sigma_{M_s}}.$$

The first and third terms drop out because σ of M_t equals zero. The difference between the mean of the theoretical population and the sampled population is .02694 ± .00897, from which it can be concluded that the mean tends to vary from the true mean.
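The figures in this comparison can be retraced with a few lines of arithmetic. The sketch assumes, as the numbers suggest, that the ± value is a probable error equal to 0.6745 times the quoted standard error of .0133; the variable names are illustrative.

```python
theoretical_mean = 3.5
sampled_mean = 3.47306
standard_error = 0.0133      # standard error of the sampled-population mean (from the text)

difference = theoretical_mean - sampled_mean
probable_error = 0.6745 * standard_error     # assumed conversion to a probable error
ratio = difference / probable_error

print(round(difference, 5), round(probable_error, 5), round(ratio, 2))
# 0.02694 0.00897 3.0
```

The difference is thus about three times its probable error.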

x̄ will hereafter refer to the mean of a sample of ten. The best estimate of the mean of a sample of ten that can be made for any sample chosen at random from the sampled population is 3.47306, and from the infinite population, 3.5000.

The standard deviation of the means of 1600 samples is .5497, while the estimated value for a sample picked at random from the sampled population is .5372 and from the theoretical infinite population .5401. These last two values are calculated by the formula

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{N}}.$$

The best estimate of the standard deviation of a sample of ten picked at random from the sampled population is the σ of the sampled population, 1.6988, or of the theoretical infinite population, 1.7078, whence the values in the tables are obtained.
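The population constants and the two estimated standard errors of the mean can be recomputed from first principles. The sketch below assumes the formula in question is the usual σ/√N with N = 10, takes the sampled population from the observed face counts of Table I, and uses hypothetical function names.

```python
from math import sqrt

# Observed face counts (sampled population of 16,000) and a fair die (infinite population)
observed = {1: 2726, 2: 2653, 3: 2671, 4: 2763, 5: 2650, 6: 2537}
fair_die = {face: 1 for face in range(1, 7)}


def mean_and_sd(freq):
    """Mean and standard deviation of a population given as {value: frequency}."""
    total = sum(freq.values())
    mean = sum(x * w for x, w in freq.items()) / total
    variance = sum(w * (x - mean) ** 2 for x, w in freq.items()) / total
    return mean, sqrt(variance)


n = 10                                   # size of each sample
mean_s, sd_s = mean_and_sd(observed)
mean_t, sd_t = mean_and_sd(fair_die)
se_sampled = sd_s / sqrt(n)
se_theoretical = sd_t / sqrt(n)

print(round(mean_s, 5), round(sd_s, 4), round(se_sampled, 4))        # 3.47306 1.6988 0.5372
print(round(mean_t, 4), round(sd_t, 4), round(se_theoretical, 4))    # 3.5 1.7078 0.5401
```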

The standard error of the standard deviation of the means of samples is .0097. The standard error of the standard error of the mean of a sample of ten from the sampled and theoretical infinite populations is zero, as these are true values.

The difference between the standard deviation of the means and the standard error of such means of samples of ten from the sampled population or the theoretical infinite population is .0125 ± .0065. Thus there is no significant difference between the value of σ_x̄ when calculated by the formula σ/√N and an actual distribution when samples as small as ten are used.

γ₁ indicates, as pointed out above, that the distribution is not skewed, while γ₂ shows the distribution to be slightly peaked but not significantly so.


STANDARD DEVIATIONS 

Figure 2 shows a histogram and a fitted Gram-Charlier Type 
A curve, of the distribution of 1600 standard deviations of 
samples of ten calculated by the formula 

x being measured from the mean, x̄.

Figure 3 shows a similar histogram and curve fitted to the 



FIGURE 2

Distribution and fitted Gram-Charlier curve of 1600 standard deviations of samples of ten, calculated by the first formula.

FIGURE 3

Distribution and fitted Gram-Charlier curve of 1600 standard deviations of samples of ten, calculated by the second formula.


TABLE III

Distribution of 1600 Standard Deviations of Samples of Ten

                                          Observed Value                   Theoretical Value
Description                         First formula    Second formula   Sampled Population   Infinite Population

Mean of s's of samples              1.5869           2.0403           1.6988               1.7078
S. D. of s's of samples             .2665            .2538            .3799                .3818
S. E. of mean of s's of samples     .0067            .0063            .0000                .0000
S. E. of S. D. of s's of samples    .0047            .0045            .0000                .0000
Difference of means                 .1119 ± .0045    .3415 ± .0042    .0000                .0000
                                    or               or
                                    .1209 ± .0045    .3325 ± .0042
Difference of S. D.'s               .1134 ± .0032    .1261 ± .0030    .0000                .0000
                                    or               or
                                    .1153 ± .0032    .1280 ± .0030
Skewness                            -.3568 ± .0413   -.5026 ± .0413   .0000 (normal theory)
Kurtosis                            .5140 ± .0826    .6851 ± .0826    .0000 (normal theory)
N                                   1600             1600












same data when the standard deviations are calculated by the 
formula 



A study of this latter formula is included here to test which 
is more appropriate when dealing with small samples from a 
rectangular population. 

An interpretation of Table III is now in order. Column one
is a description of the statistics involved. Column two is subdivided
into two parts: first, when s is calculated by the first formula, and
second, when s is calculated by the second formula. Column three gives
the theoretical values. There are two of these, one for the sampled
population and one for the infinite population. In the case of 
the sampled population the values calculated for the standard 
deviation and the become true values when a single sample 
is compared with them in exactly the same manner as if com¬ 
pared with similar values from the infinite population. The reason 
for this is that for a given sample the 16,000 constitutes the actual
population from which the sample is drawn. 

In the first line the means of the standard deviations of the 
samples are found to equal respectively, 1.5869 and 2.0403. The 
theoretical means for the sampled and infinite populations are 
respectively 1.6988 and 1.7078.

In the next line are the standard deviations of standard
deviations of samples. These are calculated values, obtained by
substituting in the formula

$$\sigma_s = \frac{\sigma}{\sqrt{2N}}.$$

As the best estimate of the standard deviations of any particular 
sample chosen at random is the standard deviation of the sampled 
population, or the infinite population, these values can be sub¬ 
stituted in the above formula in obtaining the standard error of 
the standard deviation of such a sample of ten. 
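
As a hedged check of the theoretical entries in the second line of Table III, the sketch below assumes that the formula in question is the usual normal-theory expression $\sigma/\sqrt{2N}$ for the standard error of a standard deviation, with $N = 10$. The function name `sd_of_sd` is only illustrative.

```python
from math import sqrt

def sd_of_sd(sigma, n):
    """Normal-theory standard error of a sample standard deviation (assumed formula)."""
    return sigma / sqrt(2 * n)

n = 10
sampled = sd_of_sd(1.6988, n)    # sampled population
infinite = sd_of_sd(1.7078, n)   # theoretical infinite population

print(round(sampled, 4), round(infinite, 4))
```

Both results agree with the tabulated .3799 and .3818 to within one unit in the fourth decimal place.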

The standard error of the mean of standard deviations in 





samples for both observed values is given in line three. Obviously 
in the case of the sampled and infinite populations these equal 
zero. It should be clearly understood by the reader that here N 
equals 1600, the number of standard deviations used in deter¬ 
mining the mean standard deviation. 

Line four gives the standard error of the standard deviation 
of standard deviations of samples of ten. 

Line five gives the difference between each of the true standard
deviations (sampled and infinite) and the two observed mean
standard deviations. The standard deviations of the sampled
population and of the infinite population are each greater than the
mean standard deviation of the observed population when calculated
by the first formula. In the first case the difference
is .1119 ± .0045. This is approximately 25 times its probable
error, so it must be considered a significant difference. The
difference when compared with the theoretical infinite population
is .1209 ± .0045. This is even more significant. When the
theoretical values are compared with the mean standard deviation
calculated by the second formula, the differences are found
to be .3415 ± .0042 and .3325 ± .0042. The differences here are
much greater than those found from the first formula.


Line six shows the difference between the standard errors
of the standard deviations of the true populations and the
calculated standard deviations of the standard deviations of the
samples. The difference between $\sigma_s$ and $s_s$
(.3799 − .2665) is .1134 ± .0032. This difference is approximately
35 times its probable error. The difference between .3799
and .2538 is even greater. Still larger differences are found
when $\sigma_s$ is calculated for the infinite population.
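
The differences in lines five and six, and their ratios to the probable errors, can be recomputed from the table entries. The sketch below takes the probable error to be 0.6745 times the standard error, which is the conventional factor and is assumed here; the variable names are illustrative.

```python
PROBABLE_ERROR_FACTOR = 0.6745  # conventional ratio of probable error to standard error

def probable_error(standard_error):
    return PROBABLE_ERROR_FACTOR * standard_error

# Line five, first formula: theoretical sigma minus observed mean of the s's.
observed_mean_s = 1.5869
diff_sampled = 1.6988 - observed_mean_s
diff_infinite = 1.7078 - observed_mean_s
ratio_line_five = diff_sampled / probable_error(0.0067)

# Line six, first formula: theoretical S.D. of s minus observed S.D. of the s's.
diff_sd = 0.3799 - 0.2665
ratio_line_six = diff_sd / probable_error(0.0047)

print(round(diff_sampled, 4), round(diff_infinite, 4), round(ratio_line_five, 1))
print(round(diff_sd, 4), round(ratio_line_six, 1))
```

The ratios come out near 25 and 35, in agreement with the statements in the text.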

The skewness coefficient in the case of both curves is negative and
more than 8 times its probable error, definitely showing a negative
skewness. The kurtosis coefficient in the case of both curves is more
than 6 times its probable error, indicating definite leptokurtosis.
The Gram-Charlier curves shown in Figures 2 and 3 were fitted to the
first four moments according to the equation

where «3C“ 

If we compute values of s by the empirical formula 
, the mean value is 1.7039, which lies very close to 
the theoretical values 1.6988 and 1.7078, in fact almost exactly 
half-way between them. 


CORRELATION COEFFICIENTS 

The product-moment correlation coefficient varies between 
the limits plus one and minus one. Obviously, the distribution 
of correlation coefficients cannot be normal, although in the case 
where $r = 0$ their distribution should approximate a normal
curve, as it can become symmetrical. Coefficients around any
other point tend to be distributed asymmetrically. 

It was assumed that if a deck of cards be thoroughly shuffled
there should be no correlation between successive deals. Using
a deck of cards gives a sample of 52. A new pack was
thoroughly shuffled. The cards were then dealt one at a time,
the first card dealt being recorded as number one, the second
card dealt as number two, the third card as number three, etc.
That is, if the seven of hearts was turned first, the value one
was recorded against its place in the table. After each deal the
cards were picked up in the same order and shuffled three times
by the fan method and then cut twice. Sixty such deals were
made and recorded. Then rank correlations were calculated between
each pair of deals, the total number of intercorrelations
being $\frac{n(n-1)}{2}$ with $n = 60$, here 1770.

FIGURE 4

Distribution of 1770 correlation coefficients of samples of 52, with fitted normal curve.

In this study, there could be no split ranks. Each card could 
receive one and only one rank on each deal. Thus, the rank 
correlation formula gave exactly the same values as would a 
Pearson product-moment coefficient. 
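
The equivalence of the two coefficients when there are no tied ranks can be illustrated numerically. The sketch below simulates two shuffled deals with Python's `random` module (a stand-in for the actual card deals, which are not reproduced here) and compares the rank-difference formula $1 - 6\sum d^2 / (n(n^2-1))$ with the product-moment coefficient computed on the same ranks.

```python
import random
from math import sqrt

def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def rank_correlation(x, y):
    """Rank-difference formula, valid when no ranks are tied."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d2 / (n * (n * n - 1))

rng = random.Random(0)          # fixed seed so the illustration is repeatable
deal_a = list(range(1, 53))     # rank of each of the 52 cards in one deal
deal_b = list(range(1, 53))     # rank of each card in another deal
rng.shuffle(deal_a)
rng.shuffle(deal_b)

r_rank = rank_correlation(deal_a, deal_b)
r_pearson = pearson(deal_a, deal_b)
pairs = 60 * 59 // 2            # number of pairs among sixty deals

print(r_rank, r_pearson, pairs)
```

The two coefficients agree to within floating-point rounding, and the pair count reproduces the 1770 intercorrelations.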

Figure 4 shows a histogram with a fitted normal curve superimposed
on it. The skewness for this curve is .000015 ± .0392, indicating
no skewness, and the kurtosis is .2174 ± .0785, indicating a slight
tendency to peakedness. Both of these facts are shown by the fit
of the curve to the histogram.

The formula for the standard error of a correlation coefficient
from a normal population is

$$\sigma_r = \frac{1 - \rho^2}{\sqrt{N}},$$

$\rho$ being the correlation in the population. Thus when $r = .0000$
and $N = 52$, $\sigma_r = .1387$.


The mean value of the 1770 coefficients is $r = -.0012$. The
expected mean is zero. The difference between these two values
is .0012 ± .0022. This shows that the mean correlation coefficient
is not significantly different from the expected mean correlation.

The standard deviation of the observed distribution is .1359.
This value differs from the expected value by .0028 ± .0091. The
formula $\sigma_r = \frac{1 - \rho^2}{\sqrt{N}}$ is therefore seen to give a sufficiently close
approximation in this case.
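
The figures in this comparison can be reproduced directly. The sketch below assumes the standard-error formula $(1 - \rho^2)/\sqrt{N}$ with the plain $\sqrt{N}$ divisor (the form that yields .1387 for $N = 52$) and the conventional probable-error factor 0.6745; both are assumptions of the example.

```python
from math import sqrt

def se_correlation(rho, n):
    """Standard error of a correlation coefficient, assumed form (1 - rho^2) / sqrt(N)."""
    return (1 - rho ** 2) / sqrt(n)

expected_sd = se_correlation(0.0, 52)
observed_sd = 0.1359
n_coefficients = 1770

# Probable error of the mean of the 1770 coefficients.
pe_of_mean = 0.6745 * observed_sd / sqrt(n_coefficients)

print(round(expected_sd, 4), round(expected_sd - observed_sd, 4), round(pe_of_mean, 4))
```

The output matches the quoted .1387, the difference .0028, and the probable error .0022 of the mean.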


CONCLUSIONS 

1. The distribution of means of samples of ten drawn from
a discontinuous rectangular population is normal. The formula
$\sigma_{\bar{x}} = \sigma/\sqrt{N}$ gives a reasonably close estimate of the standard
error of such means.

2. The distribution of standard deviations of samples of 





ten drawn from a discontinuous rectangular population is skewed 
and leptokurtic. The formula ^ 

sonably close estimate of the standard deviation of standard 
deviations of samples o f ten, whether the latter are computed 
from the formula s or s 

3. Neither of the two formulas for the standard deviation of a
sample of ten gives a reasonably close estimate of the true standard
deviation in a rectangular discontinuous population. The empirical
formula does appear to do so.

4. The distribution of correlation coefficients of samples
of 52 from a rank population in which the expected correlation
is zero is symmetrical and very slightly leptokurtic. The formula
$\sigma_r = \frac{1 - \rho^2}{\sqrt{N}}$ represents adequately the standard deviation of
such correlation coefficients.




EDITORIAL 


The Interdependence of Sampling and Frequency 
Distribution Theory 

The object of the theory of sampling is to describe the phenomena
exhibited by all the samples that can possibly arise from
a parent population of known characteristics. In some cases the 
desired description can be obtained directly by employing elemen¬ 
tary operations of combination theory, in others it is either ex¬ 
pedient or necessary to use the indirect attack of the statistical 
theory of sampling. These two methods are quite different in 
application, and it is advisable to illustrate the respective peculi¬ 
arities of the two methods. 

Example 1. An auction bridge hand may be regarded as a
single sample withdrawn from a parent population of 52 cards.
The number of different hands that can be selected equals the
number of combinations of 52 things taken 13 at a time, namely,
$\binom{52}{13} = 635\,013\,559\,600$. Of these

$$(1) \qquad f(z) = \binom{13}{z}\binom{39}{13 - z}$$

will contain exactly $z$ cards of any specified suit. Therefore if
in this expression we successively place $z$ equal to 0, 1, 2, ..., 13
we shall obtain the frequency of all possible samples ranked according
to the number of cards of the specified suit contained in
each sample. The results are presented in the following table.



TABLE I

   z                 f(z)      f(z)/N

   0        8 122 425 444      .01279
   1       50 840 366 668      .08006
   2      130 732 371 432      .20587
   3      181 823 183 256      .28633
   4      151 519 319 380      .23861
   5       79 181 063 676      .12469
   6       26 393 687 892      .04156
   7        5 598 661 068      .00882
   8          740 999 259      .00117
   9           58 809 465      .00009
  10            2 613 754      .00000
  11               57 798      .00000
  12                  507      .00000
  13                    1      .00000

 Total    635 013 559 600      .99999


In this illustration, combination theory has yielded a perfect 
solution. The frequencies are exact, and the sum of the frequencies
between any two limits may likewise be obtained exactly
by a simple addition. 
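
The direct attack of Example 1 is short enough to carry out mechanically. The following sketch uses Python's `math.comb` to evaluate the number of hands holding exactly $z$ cards of a specified suit, taking the general term to be $\binom{13}{z}\binom{39}{13-z}$; the variable names are illustrative.

```python
from math import comb

total = comb(52, 13)  # number of distinct 13-card hands

# Hands with exactly z cards of the specified suit: choose z of its 13 cards
# and the remaining 13 - z cards from the other 39.
freq = [comb(13, z) * comb(39, 13 - z) for z in range(14)]

for z, f in enumerate(freq):
    print(z, f, round(f / total, 5))

# The sum of consecutive frequencies between any two limits is a plain addition.
five_or_more = sum(freq[5:])
print(total, sum(freq), five_or_more)
```

The frequencies sum exactly to the total number of hands, as the text remarks.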

Example 2. The bidding strength of hands in auction bridge
is often approximated by counting each Jack, Queen, King and
Ace as 1, 2, 3 and 4 points, respectively. The total count of a
single hand may range, therefore, from 0 to 37 inclusive. Required:
the frequency distribution of all possible hands when they
are classified according to count.

Unlike the preceding problem, we cannot obtain a simple
expression for the general term, $f(z)$, of the required distribution.
But after rather involved computations the following solution
may be obtained:


TABLE II

   z          Frequency f(z)        z          Frequency f(z)

   0           2 310 789 600       19           6 579 838 440
   1           5 006 710 800       20           4 086 538 404
   2           8 611 542 576       21           2 399 507 844
   3          15 636 342 960       22           1 333 800 036
   4          24 419 055 136       23             710 603 628
   5          32 933 031 040       24             354 993 864
   6          41 619 399 184       25             167 819 892
   7          50 979 441 968       26              74 095 248
   8          56 466 608 128       27              31 157 940
   9          59 413 313 872       28              11 790 760
  10          59 723 754 816       29               4 236 588
  11          56 799 933 520       30               1 396 068
  12          50 971 682 080       31                 388 196
  13          43 906 944 752       32                 109 156
  14          36 153 374 224       33                  22 360
  15          28 090 962 724       34                   4 484
  16          21 024 781 756       35                     624
  17          14 997 080 848       36                      60
  18          10 192 504 020       37                       4

                                  Total       635 013 559 600
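
The "rather involved computations" of Example 2 are easy to mechanize today. The sketch below is one possible way of doing so, not necessarily the method used for the table: it enumerates how many Jacks, Queens, Kings and Aces a hand holds, and fills the rest of the hand from the 36 cards that count zero. The function name `point_count_distribution` is illustrative.

```python
from math import comb

def point_count_distribution(hand=13):
    """Number of hands of the given size for each total point count."""
    # ways[(cards, points)] = number of ways to pick that many honour cards
    # with that total count, built up one rank at a time.
    ways = {(0, 0): 1}
    for value in (1, 2, 3, 4):            # Jack, Queen, King, Ace
        updated = {}
        for (cards, points), w in ways.items():
            for k in range(5):            # take k of the 4 cards of this rank
                key = (cards + k, points + k * value)
                updated[key] = updated.get(key, 0) + w * comb(4, k)
        ways = updated

    freq = [0] * 38                       # counts 0..37 are attainable
    for (cards, points), w in ways.items():
        if cards <= hand:
            # Fill the rest of the hand from the 36 cards that count zero.
            freq[points] += w * comb(36, hand - cards)
    return freq

freq = point_count_distribution()
for z, f in enumerate(freq):
    print(z, f)
```

The frequencies obtained this way add up to the total number of hands, and the extreme entries can be compared with the first and last rows of Table II.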
















Example 3. If the mean and the standard deviation of the 
weights of a group of 200,000 men be 140 lbs, and 20 lbs., re¬ 
spectively, and if in addition it be known that the higher standard 
moments of this distribution be 


3^17 <e:x^l7.97, 


what is the chance that the mean weight of 1000 men chosen at 
random from the 200,000 will exceed 141 pounds? 

It is clear that it would be physically impossible to solve this 
problem by employing a direct attack by combination theory, even 
though the weights of each of the 200,000 men were available. 
Moreover, it is likewise evident that in statistical problems cor¬ 
responding to the illustrations of examples 1 and 2, the number 
of individuals in both the parent population and each sample is 
considerably larger than 52 and 13 respectively, and consequently
the calculation of either a single frequency or the sum of any
large group of consecutive frequencies by the direct method is
quite out of the question. 

Let us now consider the three examples above from the point 
of view of the indirect attack. The parent populations for the 
first two examples may be interpreted as 


    Variates       x       0     1
    Frequencies    f(x)   39    13

and

    Variates       x       0     1     2     3     4
    Frequencies    f(x)   36     4     4     4     4

respectively.





For the first, the mean is at $x = 1/4$, and the moments
about the mean of the parent population are obviously 

For the second, the mean is at $x = 10/13$, and correspondingly
the moments of this parent population are

y r » n n » rr} 

J 

If $s$ and $r$ denote the number of individuals in the parent
population and each sample respectively, then the moments of the
distribution of all samples that can arise from this parent population
may be obtained from those of the parent population by
means of the relations

r 

( 2 ) 1 ‘ ^(pr*^Pz 

"^*P* -''«>/>*) 

V, ^ * «/»*> 

^ ^ - ^P* * ^Ps VJ,) 





where 


^ ^ ' t o i factors 

' to i factors 

Since the moments of the parent populations of these three examples
are now known, and according to the conditions of the problems
the values of $(r, s)$ are (13, 52), (13, 52), and (1000, 200000)
respectively, it follows that the moments of the desired distributions
of samples are as follows:


Function 

Example 1 

Example 2 

Example 3 


13/4 

10 

A7, -140 lbs. 


507/272 

29Q/17 

. .630874 lbs. 


6S91/1.4600 

28f5/17 

.0156927 


S3591421/5331200 

17441114/29155 

3.0001357 


9339447/1066240 

2262240/833 j 

•6,.*- .1569051 


71781908037/801812480 

2684384074/59151 ! 

153)26638 
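
The first two rows of this table can be checked independently. The sketch below does not use the relations (2); it assumes instead the familiar results for sampling without replacement, namely that the total of $r$ draws from a population of $s$ has mean $r$ times the parent mean and variance $r\sigma^2 (s-r)/(s-1)$. The helper names are illustrative.

```python
from fractions import Fraction
from math import sqrt

def mean_and_variance(freq):
    """Exact mean and variance of a parent population given as {variate: frequency}."""
    total = sum(freq.values())
    mean = Fraction(sum(x * f for x, f in freq.items()), total)
    var = sum(f * (x - mean) ** 2 for x, f in freq.items()) / total
    return mean, var

def total_of_sample(mean, var, r, s):
    """Mean and variance of the total of r draws without replacement from s individuals."""
    return r * mean, r * var * Fraction(s - r, s - 1)

# Example 1: 13 cards of the specified suit among 52.
m1, v1 = mean_and_variance({0: 39, 1: 13})
mean1, var1 = total_of_sample(m1, v1, 13, 52)

# Example 2: point counts 0 (36 cards) and 1, 2, 3, 4 (4 cards each).
m2, v2 = mean_and_variance({0: 36, 1: 4, 2: 4, 3: 4, 4: 4})
mean2, var2 = total_of_sample(m2, v2, 13, 52)

# Example 3: standard deviation of the mean weight of 1000 men out of 200,000.
r, s, sigma = 1000, 200000, 20.0
sd_mean3 = sqrt(sigma ** 2 / r * (s - r) / (s - 1))

print(m1, v1, mean1, var1)
print(m2, v2, mean2, var2)
print(round(sd_mean3, 6))
```

Under these assumptions the results are 13/4 and 507/272 for Example 1, 10 and 290/17 for Example 2, and .630874 lbs. for Example 3, which agree with the first two rows of the table.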


It will be observed that the indirect procedure has yielded 
the moments of the required distributions rather than their fre¬ 
quency functions, and the next step therefore is to obtain with the 
aid of these moments approximate expressions for the desired 
frequency functions. In this connection it should be borne in 
mind that we are not concerned with questions regarding the
probable errors of the moments which we are employing, since 
the moments computed for the distributions of samples are neces¬ 
sarily exact, and their probable errors are therefore zero. For 


¹See Annals, Vol. I, page 104.




















this reason arguments tending to limit the number of terms that 
may be employed in either a Gram-Charlier series, or in the de¬ 
nominator of Pearson’s differential equation are not to the point 
so far as our illustrations are concerned. These remarks hold 
even for the third example, since if the moments of the parent 
population are as given, then the moments of the distribution of 
samples may be determined with any desired degree of accuracy. 

Since it is evident that the solution of our problems now 
depends upon our obtaining approximate expressions for these 
distributions whose moments are known, we shall at this point 
develop a general method of representing discrete distributions 
which is essentially due to the researches of Charlier. Although 
the results that we shall obtain are practically those that have 
also been obtained by Gram, Edgeworth and others, the method
that we shall employ is that used by Charlier in “Die strenge
Form des Bernoullischen Theorems.”

Let f(^) be the frequency function for a discrete dis¬ 
tribution ranging from oc« fo **^*■'2 • If the ordinates 
be equidistant at intervals of h , the total frequency of the dis¬ 
tribution is 


f(=c). 


where our interest is focused on a typical ordinate at 
If we now set up the function 








» 

where $i = \sqrt{-1}$, and multiply each side by $e^{-x_0 \omega i}$ so that






fUMfC..!,): ‘“I 


•tfA\ JT . , -ttwi 

•fUj'e +fCx,-2A}e 


«-JC^ 


We obtain by integrating both members with respect to $\omega$ between
the limits $-\pi/h$ and $\pi/h$

n n 




since the integral of all other terms of the right hand member will
vanish as follows: 

I 

/„ f • fCx„*mh) * 


-it 


4( 


CQ9 mha^4‘i /nAtt?! cfct^»C 


] 


($m$ is an integer.)

It follows therefore that






Moreover, since 


-buii 

1-e ♦■•+« 


-Cb*h)w -atoi 
e _ zS. _ 


we see that the sum of all the consecutive frequencies from 
to may be expressed as the definite integral 



The changing of the order of integration is permitted since the
limits are all finite. 
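
The inversion step can be illustrated numerically on the distribution of Example 1. The sketch below assumes the sign convention in which the sum $\sum f(x) e^{x\omega i}$ is multiplied by $e^{-x_0\omega i}$ and integrated over $\omega$ from $-\pi/h$ to $\pi/h$, with the factor $h/2\pi$ in front; the integral is evaluated by an equally spaced sum, which is exact here apart from rounding because the integrand is a trigonometric polynomial of low degree. The names `characteristic_sum` and `recover` are illustrative.

```python
import cmath
from math import comb, pi

h = 1  # spacing of the ordinates
f = {x: comb(13, x) * comb(39, 13 - x) for x in range(14)}  # Table I frequencies

def characteristic_sum(omega):
    """Sum of f(x) * exp(i x omega) over all ordinates."""
    return sum(fx * cmath.exp(1j * x * omega) for x, fx in f.items())

def recover(x0, steps=64):
    """(h / 2 pi) * integral over (-pi/h, pi/h) of exp(-i x0 omega) * characteristic_sum."""
    d = (2 * pi / h) / steps
    total = 0
    for k in range(steps):
        omega = -pi / h + (k + 0.5) * d   # midpoint of the k-th sub-interval
        total += cmath.exp(-1j * x0 * omega) * characteristic_sum(omega)
    return (h / (2 * pi)) * total * d

single = recover(3)
group = sum(recover(x) for x in range(5, 14))  # sum of consecutive frequencies

print(single.real, f[3])
print(group.real, sum(f[x] for x in range(5, 14)))
```

The recovered ordinate and the recovered sum of consecutive frequencies agree with the exact values to within floating-point error.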

Ordinarily frequency distributions are expressed as develop¬ 
ments of the integral (4), and the sums of consecutive frequen¬ 
cies obtained by applying the Euler-Maclaurin Sum-Formula to 
these results. It seems at first sight that it might be well to place 
a little more emphasis upon the evaluation of (5), since this as 
it stands affords an exact expression for the sum of any group 
of consecutive frequencies. For the case of continuous variates 
we need only permit h to approach zero, replace the sign of 
summation by the sign of integration, etc., and after justifying
the change in the order of integration for the resulting infinite 
limits obtain 








•e 


jcwi 


) 







We shall now attempt to evaluate the definite integral (4). 
Let us first observe that the quantity within the parenthesis is a 
function of $\omega$, since the finite integration with respect to $x$
and the subsequent replacing of $x$ by the limits will cause this
distribution variable to disappear. 

For reasons which will develop later, let us write 


If in Leibnitz' formula 


we i^ce u^e ^ and e > ^nd note that 


Jj € 


B^O 


D 


Bn 


€ 



then 






*r-e. a _ 


a e!^ >h6 s 


where 


n^‘^*n(n-))(n-Z) • • • . . to / factors- 





Thus we may write 



and employing the notation 

(x-b.)f(x)- Ny„ 

we obtain from (7) 

(8) I -A, 


L 




«o 



Formula (4) may therefore be written, dropping the sub¬ 
script on 


( 9 ) 


(S^* - j 





Placing 


( 10 ) 



e ^ <iu) 


it follows that the nth derivative with respect to $x$ is


(H) 


frti 


e « 


^tr 






so finally


( 12 ) j(x)»N'h [^ 6 ^- - 


Let us now investigate the function &(x), 

QOe) ^**** 

~h 

-< nrf(x j rfft) 

s J. f ** e ^^cos(x-b,)u>€i*^9 





^ since <8 


-62<*>/s -b ,)€0 is an odd function of cij3 

^C 09 (x-b,)eo cita 

O 

t /■*• -A 

~nl e * ccsCjc -t>,)to<ito 


'ar 

h 




- 0fx;- . 




e I ^ * cos{x-bJtoct<^ 


£ 

h 


/ 


-‘Kc, e. 


‘m 


y4«»l 


Likewise we may write 


OM-ip 



T6,«y« 

Ct«J 


By successive integration by parts it can be shown that 


( 13 ) + in-ltn-^x” ^+ ' -t(n-i][n‘y)-^-(n'2i^i)x , 

*2 

^i’(n-tXrt-3) "• (7T-2i-tl) * dbe 





so we have that




/jr\* 




n-/ 





So far we have said nothing concerning the values of the
parameters $b_1$ and $b_2$. Referring to formula (8) it is seen
that if the origin of $x$ be taken at the mean of the distribution
in question, and $b_2$ equal the second moment about the mean
of this distribution, $c_1 = c_2 = 0$, and consequently if the
values of may be neglected, the equation of the distribution 
expressed in standard units becomes 


(15) A>‘sf w- s'A* j 

.here ^ * • “xl 

( 16 ) ; 

Ac • 






“ ■ - . 
n-* 2 *. 5 / 


'ir-6 





By employing the Euler-Maclaurin Sum-Formula we can 
write 


ffa) + f(a+h) * ■ + f(b-h)*f(b) 


(17) 


a 


where



<r ^6 * tr* ' 7£ 

] 

\ a' ^ h ^s-ZO-ca 

7ZO er £40 or* 2*8 30240 flr‘ 

In some cases it may be more convenient to employ a mean
and a standard deviation of the generating function that differs
somewhat from that of the distribution for which the representation
is desired. In this event the coefficients of the first and second
derivatives in (15) will not vanish. However, the extra effort





expended in increasing the number of significant terms may be
more than offset by the fact that a rather arbitrary choice in the
values of $b_1$ and $b_2$ may result in simpler values for


t” 

which in turn may occasionally eliminate difficult interpolations
when dealing with tabulations of the generating function and its
derivatives.

Formulae (17) and (18) may be regarded as a sort of apol¬ 
ogy for the fact that the definite integral of formula (5) has 
never been developed. The need of a satisfactory expression for 
the sum of any number of consecutive variates is indeed acute. 

By permitting $h$ in the foregoing theory to approach zero,
one can obtain corresponding formulae for the ordinates and 
areas of distributions of continuous variates. However, it should 
be noted that for this case the limits for the integrals in the 
vicinity of formula (4) are now 


x-b, 

7 ^ 


h’-O 


h 


and consequently the changing of the order of integration must 
be justified. 

In conclusion we may state: 

I. Answers to problems of statistical sampling are usually
expressed as finite or infinitesimal integrals under a function
whose moments only are known. If known, the function is generally
of but little value.

II. It is necessary to approximate the desired integrals by 
employing frequency functions. 





III. Present methods are unsatisfactory from the point of 
view that remainder or limit of error terms are not available. 
The $\chi^2$ test, though helpful, does not meet the issue in question.




NOTE ON THE DISTRIBUTION OF MEANS 
OF SAMPLES OF N DRAWN FROM 
A TYPE A POPULATION 


By 

Cecil C. Craig
National Research Fellow


Recently in this journal, Dr. George A. Baker has found
“the distribution of the means of samples drawn at random from
a population represented by a Gram-Charlier series.”¹ It is the
purpose of this note to call attention to the fact that by the use
of the semi-invariant notation Dr. Baker's results may be reached
in very many fewer steps.

Let the parent population be represented by 


(1) ffa). (p(^ [a ^ w, IJ 

in which 

/ - 

( 2 ) 


iVol. 1, No. 3 (Aug., 1930), pp. mZM. 



the origin for $x$ being chosen at the mean, and


( 3 ) - dJ'Cc'^). 

We shall first find the distribution function of $z = x_1 + x_2 + \cdots + x_N$,
in which $x_i$, $i = 1, 2, \ldots, N$, has the frequency function
$f(x)$. Let us assume the frequency function of $z$ is given by


( 4 ) 


F(z)^ (p(z) 







+ • • • • f- 




Then the semi-invariants of $f(x)$, $\lambda_1, \lambda_2, \lambda_3, \ldots$,
are defined by the formal identity in $t$:

(5) c ■♦• • • V Jdx A, *0 in this .case) 


and on integration, using (3), we get at once on the right:


A.lf 

•a 


Similarly for the semi-invariants * Lj’’* (2) 

we have 




hAgi^A^t— +(■ 


i j 


But because of the well-known fact that $L_r = N\lambda_r$ this
gives





•^\j~a^t^*a^t‘* -^ * J 

an identity in t, Thus 


(?) A^-Z 


__ 

V, !VJ -V^f (NA)^-^ • •• M.)! 



■ a 


IC 


the summation including all terms for which 
* ■ > • ^ k\/^ ^ r 

Remembering that “•j/FT Ojc > ** <* 

substitution in (4) the expression for $F(z)$ since only a finite
number of $A_r$'s (depending on $N$) are different from zero.

To get the distribution of $m = (x_1 + x_2 + \cdots + x_N)/N$ only
involves the appropriate change of unit.


x <gL ^ . J yCgL 
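
The property on which the note rests, that each semi-invariant of a sum of $N$ independent variates is $N$ times the corresponding semi-invariant of a single variate, can be checked exactly on a small example. The sketch below uses an arbitrary three-point parent distribution chosen only for illustration, and computes the first four semi-invariants from the central moments ($\lambda_2 = \mu_2$, $\lambda_3 = \mu_3$, $\lambda_4 = \mu_4 - 3\mu_2^2$).

```python
from fractions import Fraction

def convolve(p, q):
    """Distribution of the sum of two independent discrete variates."""
    out = {}
    for a, pa in p.items():
        for b, qb in q.items():
            out[a + b] = out.get(a + b, 0) + pa * qb
    return out

def semi_invariants(p):
    """First four semi-invariants of a discrete distribution {value: probability}."""
    mean = sum(x * w for x, w in p.items())
    def central(r):
        return sum(w * (x - mean) ** r for x, w in p.items())
    m2, m3, m4 = central(2), central(3), central(4)
    return [mean, m2, m3, m4 - 3 * m2 ** 2]

# Hypothetical parent distribution, chosen only for illustration.
parent = {0: Fraction(1, 2), 1: Fraction(1, 3), 3: Fraction(1, 6)}
N = 5

total = parent
for _ in range(N - 1):
    total = convolve(total, parent)

print(semi_invariants(parent))
print(semi_invariants(total))
```

With exact rational arithmetic the semi-invariants of the sum are exactly $N$ times those of the parent.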



Stanford University.




ON SYMMETRIC FUNCTIONS 
AND SYMMETRIC FUNCTIONS OF 
SYMMETRIC FUNCTIONS* 


By 

A. L. O’Toole 


INTRODUCTION 

The study of symmetric functions is quite an old one. From 
the time of Girard (1629) even up to the present day this sub¬ 
ject has occupied the attention of many eminent mathematicians. 
The theory of the roots of algebraic equations in one or more 
variables has furnished the chief incentive for the development 
of the theory of symmetric functions. Ingenious methods for 
computing symmetric functions in terms of what are called the 
elementary symmetric functions have been developed by Ham¬ 
mond, Brioschi, Junker, Dresden and others. Extensive tables 
of symmetric functions in terms of the elementary symmetric 
functions may be found in the literature. 

Symmetric functions play such a pre-eminent role in the 
mathematical theory of statistics and their computation by direct 
methods or by general formulas, even when assumptions restrict¬ 
ing the groupings of the variates about the various means are 
made, is so excessively tedious that there has seemed to be need 

*A dissertation submitted in partial fulfillment of the requirements for the 
degree of Doctor of Philosophy in the University of Michigan. 






of development of the theory of symmetric functions in directions
not suggested by the theory of equations. The ingenious
methods referred to above are of little or no practical value in
statistics; for they express a symmetric function in terms of the
elementary symmetric functions whilst here it is necessary to
express the symmetric function in terms of what are called the
power sums. Likewise, and for the same reason, the tables mentioned
are of no value to the student of statistics.

Moreover, in the theory of sampling one not only has to 
deal with symmetric functions of the given variates but with 
symmetric functions of symmetric functions of the given vari¬ 
ates. This then leads to interesting as well as practical develop¬ 
ments in the theory of symmetric functions. 

In this investigation it is proposed to: 

1. Develop symbolic methods which will enable one to 
express any given symmetric function in terms of the power 
sums, without knowing the expressions for the symmetric func¬ 
tions of lower weight, and which will also lend themselves readily 
to the construction of tables; 

2. Develop symbolic devices in the more general case of a 
symmetric function of symmetric functions.





CHAPTER I 


Direct Computation 

1. Suppose there is given a set of $n$ variates¹ $x_1, x_2, x_3, \ldots, x_n$,
no assumptions whatever being made
as to their arrangement about the various means. Any rational,
integral, algebraic function of these $n$ variates which is unaltered
by interchanges or permutations of the variates is called
a symmetric function. With a few modifications, the usual notation
for symmetric functions will be used in this investigation.
The power sums $s_1, s_2, s_3, \ldots$:

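Let

$$s_1 = \sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n,$$

$$s_2 = \sum_{i=1}^{n} x_i^2 = x_1^2 + x_2^2 + \cdots + x_n^2,$$

$$\cdots$$

$$s_t = \sum_{i=1}^{n} x_i^t = x_1^t + x_2^t + \cdots + x_n^t.$$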

¹The variates may be either real or complex numbers.

Further, let $(a^{\alpha} b^{\beta} c^{\gamma} \cdots)$ represent any symmetric
function of the given variates. In other words, let $(a^{\alpha} b^{\beta} c^{\gamma} \cdots)$
equal the sum of all the terms such as

$$x_1^a x_2^a \cdots x_{\alpha}^a \, x_{\alpha+1}^b \cdots x_{\alpha+\beta}^b \, x_{\alpha+\beta+1}^c \cdots x_{\alpha+\beta+\gamma}^c \cdots$$

which can be formed from the $n$ variates, where $a, b, c, \ldots$
and $\alpha, \beta, \gamma, \ldots$ are positive integers and $a > b > c > \cdots > 0$, e. g.

(3 <z?XjX^x^ , 

j-i 

k^t 

rn-f 


Definitions:

A partition of a positive integer $t$ is any set of positive
integers whose sum is $t$. The integers which constitute the partition
are called the parts of the partition and are enclosed in
parentheses ( ). It is desirable to arrange the parts in descending
order of magnitude from left to right. Obviously then, for
any finite positive integer $t$ each partition of $t$ contains a finite
number of parts. If there are $r$ parts in the partition of $t$
then the partition is called an r-part partition of $t$ or simply
an r-partition of $t$, e. g. (33), (321), (3111) are respectively
2-part, 3-part and 4-part partitions of 6. When repeated parts
appear in the partition it is customary to write one of the repeated
parts with an index corresponding to the number of times
that part is repeated. Thus (33) is written ($3^2$) and (3111) is
written ($31^3$). The number $t$ is called the weight of the partition.
For a discussion of the formulae for finding the number
of partitions of an integer the reader is referred to Whitworth's
“Choice and Chance.”²

It will now be clear that the notation introduced for the general
symmetric function is a partition notation. The weight of
a symmetric function is the degree in all the variates of any term
in the summation. The order of a symmetric function is the
highest degree in which each variate appears in the summation.
For instance, in (432) the weight is
$4 + 3 + 2 = 9$ and the order is 4. It follows that in the partition
notation of a symmetric function the weight is given by
$a\alpha + b\beta + c\gamma + \cdots$ and the order by $a$. In the partition
notation the power sums become simply (1), (2), (3), ...,
($t$) respectively.
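
The partition notation lends itself to a small illustration. The sketch below enumerates the partitions of an integer with the parts in descending order, writes repeated parts with an index, and reads off the weight and order; the function names are illustrative, and the compact form is only unambiguous while every part is a single digit.

```python
from itertools import groupby

def partitions(t, largest=None):
    """All partitions of t as tuples with parts in descending order."""
    if largest is None:
        largest = t
    if t == 0:
        return [()]
    found = []
    for first in range(min(t, largest), 0, -1):
        for rest in partitions(t - first, first):
            found.append((first,) + rest)
    return found

def compact(partition):
    """Write repeated parts with an index, e.g. (3, 1, 1, 1) -> '(31^3)'."""
    pieces = []
    for part, run in groupby(partition):
        count = len(list(run))
        pieces.append(str(part) if count == 1 else f"{part}^{count}")
    return "(" + "".join(pieces) + ")"

def weight(partition):
    return sum(partition)      # sum of all the parts

def order(partition):
    return max(partition)      # largest part

for p in partitions(6):
    print(compact(p), "parts:", len(p), "weight:", weight(p), "order:", order(p))
```

For the weight 6 this lists eleven partitions, among them (3^2), (321) and (31^3), the examples given in the text.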

For the purpose of mathematical statistics, moments rather
than the power sums are the important thing. However, the
transformation from power sums to moments is so simple that
the results of this investigation in terms of power sums may be
written in terms of the moments by putting

$$n\mu'_t = s_t,$$

where $\mu'_1, \mu'_2, \mu'_3, \ldots$ are the statistical moments of the $n$ variates.

2. It is not difficult to express certain symmetric functions 
in terms of the power sums. Practically all texts in higher al¬ 
gebra devote a section or two to this problem. Most of those 
which develop general formulae do so by using the properties of 
the coefficients of an algebraic equation. However, many others 
have developed general formulae in symmetric functions without 


²W. A. Whitworth, “Choice and Chance,” G. E. Stechert and Co., N. Y.,
fifth edition, page 100.





making use of the algebraic equation in their derivations. The 
latter procedure will be followed here in order to emphasize the 
fact that the interest is not in the theory of equations but in a 
set of variates such as might appear for instance in a statistical 
problem. A few of the general formulae of symmetric functions 
will be developed now by direct computation in order to demonstrate
a basic theorem of this work, a theorem which will be
stated at the close of this chapter.

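Multiplying $s_2$ and $s_1$ the result is

$$(x_1^2 + x_2^2 + \cdots + x_n^2)(x_1 + x_2 + \cdots + x_n) = \sum_{i=1}^{n} x_i^3 + \sum_{i \neq j} x_i^2 x_j,$$

that is,

$$(2)(1) = (3) + (21), \quad \text{hence}$$

$$(21) = (2)(1) - (3).$$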

Similarly, if $u \neq v$,

$$(u)(v) = \sum_{i=1}^{n} x_i^u \sum_{j=1}^{n} x_j^v = \sum_{i \neq j} x_i^u x_j^v + \sum_{i=1}^{n} x_i^{u+v} = (uv) + (u+v), \quad \text{hence}$$

$$(uv) = (u)(v) - (u+v).$$


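However if $v = u$ a modification is necessary, for then

$$(u)^2 = \sum_{i=1}^{n} x_i^{2u} + \sum_{i \neq j} x_i^u x_j^u = (\overline{2u}) + 2(u^2), \quad \text{and thus}$$

$$2!\,(u^2) = (u)^2 - (\overline{2u}),$$

where the bar over $2u$ indicates ordinary algebraic multiplication
of 2 and $u$, i. e. $(\overline{2u})$ is the power sum $s_{2u}$.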

If $u \ne v \ne w$, $u + v \ne w$, $u + w \ne v$, $v + w \ne u$, then

$$(u)(v)(w) = \sum_{i=1}^{n} x_i^u \sum_{j=1}^{n} x_j^v \sum_{k=1}^{n} x_k^w = (uvw) + (u+v,\, w) + (v+w,\, u) + (u+w,\, v) + (u+v+w),$$

the commas being used to separate the parts of the partitions. Now applying the result obtained for $(uv)$ to the second, third and fourth terms on the right of this last expression, it becomes, since

$$(u+v,\, w) = (u+v)(w) - (u+v+w),$$
$$(v+w,\, u) = (v+w)(u) - (u+v+w),$$
$$(u+w,\, v) = (u+w)(v) - (u+v+w),$$

$$(u)(v)(w) = (uvw) + (u+v)(w) + (v+w)(u) + (u+w)(v) - 2(u+v+w).$$

Finally

$$(uvw) = (u)(v)(w) - (u+v)(w) - (v+w)(u) - (u+w)(v) + 2(u+v+w).$$


If $u = v = w$, then a modification is again necessary, and repeating the multiplication with $u = v = w$ it is found that

$$3!\,(u^3) = (u)^3 - 3(\overline{2u})(u) + 2(\overline{3u}).$$
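The two three-part results above can be checked numerically by brute force. The following is only a sketch: the helper names `power_sum` and `ordered_sum` and the test variates are illustrative assumptions, not part of the original paper.

```python
from itertools import permutations

def power_sum(xs, t):
    """s_t: the sum of the t-th powers of the variates."""
    return sum(x ** t for x in xs)

def ordered_sum(xs, exponents):
    """Sum of x_i^u x_j^v x_k^w ... over all ordered choices of distinct indices."""
    total = 0
    for idx in permutations(range(len(xs)), len(exponents)):
        term = 1
        for i, e in zip(idx, exponents):
            term *= xs[i] ** e
        total += term
    return total

xs = [2, 3, 5, 7]                 # arbitrary integer variates (assumption)
s = lambda t: power_sum(xs, t)

# (uvw) with u, v, w and all pairwise sums different: u=4, v=2, w=1
u, v, w = 4, 2, 1
lhs_uvw = ordered_sum(xs, (u, v, w))
rhs_uvw = (s(u) * s(v) * s(w) - s(u + v) * s(w) - s(v + w) * s(u)
           - s(u + w) * s(v) + 2 * s(u + v + w))

# 3!(u^3) with u = 2: the ordered sum counts each term of (u^3) exactly 3! times
u = 2
lhs_u3 = ordered_sum(xs, (u, u, u))
rhs_u3 = s(u) ** 3 - 3 * s(2 * u) * s(u) + 2 * s(3 * u)
```

Because everything is done in exact integer arithmetic, equality of the left and right sides is an exact check rather than a floating-point approximation.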



In like manner, if $u \ne v \ne w \ne z$, $u + v \ne w$, etc., $u + v + w \ne z$, etc., then

$$\begin{aligned}
(uvwz) ={}& (u)(v)(w)(z) - (u)(v)(w+z) - (u)(w)(v+z) - (u)(z)(v+w) \\
& - (v)(w)(u+z) - (v)(z)(u+w) - (w)(z)(u+v) \\
& + 2(u)(v+w+z) + 2(v)(u+w+z) + 2(w)(u+v+z) + 2(z)(u+v+w) \\
& + (u+v)(w+z) + (u+w)(v+z) + (u+z)(v+w) - 6(u+v+w+z).
\end{aligned}$$

If $u = v = w = z$, then

$$4!\,(u^4) = (u)^4 - 6(u)^2(\overline{2u}) + 8(u)(\overline{3u}) + 3(\overline{2u})^2 - 6(\overline{4u}) = s_u^4 - 6 s_u^2 s_{2u} + 8 s_u s_{3u} + 3 s_{2u}^2 - 6 s_{4u}.$$
Similar modifications are necessary when some but not all of the parts of the partition are equal. For example,

$$\begin{aligned}
(u)^2 (v) &= (x_1^u + x_2^u + \cdots + x_n^u)^2 (x_1^v + x_2^v + \cdots + x_n^v) \\
&= (\overline{2u} + v) + (\overline{2u},\, v) + 2(u+v,\, u) + 2(u^2 v) \\
&= (\overline{2u})(v) + 2(u)(u+v) - 2(\overline{2u} + v) + 2(u^2 v),
\end{aligned}$$

hence

$$2!\,(u^2 v) = (u)^2 (v) - (\overline{2u})(v) - 2(u)(u+v) + 2(\overline{2u} + v) = s_u^2 s_v - s_{2u} s_v - 2 s_u s_{u+v} + 2 s_{2u+v}.$$


3. Proceeding after the above fashion, any symmetric function whatever can be expressed in terms of the power sums. However, the process becomes increasingly cumbersome and the general formula is of no practical value for the purpose of computation. Moreover, it is necessary to use a continuous process, that is, to work from the simpler symmetric functions of small weight to the more complex symmetric functions of greater weight.

A special case may be worth mentioning to illustrate still further the carrying out of the direct process in the general case.

$$(u)^t = (x_1^u + x_2^u + \cdots + x_n^u)^t .$$

Applying the multinomial theorem and assuming that the law holds for $t - 1$ and that the symmetric functions of weight less than $t$ are known and transposing all the terms of the right member except the term involving $(u^t)$, it is found that

$$t!\,(u^t) = \sum (-1)^{\nu + t}\, \frac{t!}{1^{a_1} 2^{a_2} \cdots t^{a_t}\; a_1!\, a_2! \cdots a_t!}\; s_u^{a_1} s_{2u}^{a_2} \cdots s_{tu}^{a_t},$$

where $a_1, a_2, a_3, \ldots, a_t$ are either positive integers or zeros such that $a_1 + a_2 + \cdots + a_t = \nu$ and $a_1 + 2a_2 + \cdots + t a_t = t$.

In particular, if $u = 1$, then

$$t!\,(1^t) = \sum (-1)^{\nu + t}\, \frac{t!}{1^{a_1} 2^{a_2} \cdots t^{a_t}\; a_1!\, a_2! \cdots a_t!}\; s_1^{a_1} s_2^{a_2} \cdots s_t^{a_t}.$$


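This last result may be expressed very conveniently in determinant form. Starting with the results obtained in article 2, it is seen that

$$1!\,(1) = s_1, \qquad
2!\,(1^2) = \begin{vmatrix} s_1 & 1 \\ s_2 & s_1 \end{vmatrix}, \qquad
3!\,(1^3) = \begin{vmatrix} s_1 & 1 & 0 \\ s_2 & s_1 & 2 \\ s_3 & s_2 & s_1 \end{vmatrix},$$

$$4!\,(1^4) = \begin{vmatrix} s_1 & 1 & 0 & 0 \\ s_2 & s_1 & 2 & 0 \\ s_3 & s_2 & s_1 & 3 \\ s_4 & s_3 & s_2 & s_1 \end{vmatrix},$$

$$t!\,(1^t) = \begin{vmatrix}
s_1 & 1 & 0 & \cdots & 0 \\
s_2 & s_1 & 2 & \cdots & 0 \\
s_3 & s_2 & s_1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
s_{t-1} & s_{t-2} & s_{t-3} & \cdots & t-1 \\
s_t & s_{t-1} & s_{t-2} & \cdots & s_1
\end{vmatrix}.$$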


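To establish this general law it is sufficient to note that the development of this determinant gives as a general term

$$(-1)^{\nu + t}\, \frac{t!}{1^{a_1} 2^{a_2} \cdots t^{a_t}\; a_1!\, a_2! \cdots a_t!}\; s_1^{a_1} s_2^{a_2} \cdots s_t^{a_t},$$

where $a_1, a_2, \ldots, a_t$ are positive integers or zeros which satisfy the conditions $a_1 + a_2 + \cdots + a_t = \nu$ and $a_1 + 2a_2 + \cdots + t a_t = t$.

Hence the determinant is equal to

$$\sum (-1)^{\nu + t}\, \frac{t!}{1^{a_1} 2^{a_2} \cdots t^{a_t}\; a_1!\, a_2! \cdots a_t!}\; s_1^{a_1} s_2^{a_2} \cdots s_t^{a_t},$$

where, as before, the summation is over all the different terms it is possible to obtain by assigning $a_1, a_2, \ldots, a_t$ all positive integral values or zeros which satisfy the conditions

$$a_1 + a_2 + \cdots + a_t = \nu, \qquad a_1 + 2a_2 + \cdots + t a_t = t.$$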
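The determinant form can be checked numerically for small $t$. The sketch below is an illustration only: the function names `det`, `waring_matrix` and `elementary` and the sample variates are assumptions introduced here, and the cofactor expansion is chosen for brevity rather than speed.

```python
from itertools import combinations
from math import factorial, prod

def det(m):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        if a == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def waring_matrix(s, t):
    """t x t matrix: s_1 on the diagonal, s_2, s_3, ... below it, 1, 2, ..., t-1 just above it."""
    m = [[0] * t for _ in range(t)]
    for i in range(t):
        for j in range(t):
            if j <= i:
                m[i][j] = s[i - j + 1]
            elif j == i + 1:
                m[i][j] = i + 1
    return m

def elementary(xs, t):
    """(1^t): the sum of the products of t distinct variates."""
    return sum(prod(c) for c in combinations(xs, t))

xs = [1, 2, 3, 4, 5]                                   # arbitrary variates (assumption)
s = {k: sum(x ** k for x in xs) for k in range(1, 6)}  # power sums s_1 .. s_5

# For each t, compare the determinant with t!(1^t) computed directly.
checks = [(t, det(waring_matrix(s, t)), factorial(t) * elementary(xs, t))
          for t in range(1, 6)]
```

Each entry of `checks` holds $t$, the determinant, and $t!\,(1^t)$ computed directly from the variates; the last two should agree.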






4. This chapter will be concluded here with the statement of a very important theorem which may now be written and which will serve as a basis for the developments in the chapters to follow.

Basic Theorem: 

Any symmetric function (defined in article 1) may be ex¬ 
pressed as a rational, integral, algebraic function of the power 
sums. 

Further, each term in the expression for the symmetric function in terms of the power sums is of the same weight as the symmetric function itself. Hence a term which does not arise from a partition of the weight of the symmetric function cannot appear in the expression in terms of the power sums.





CHAPTER II 


A Differential Operator Method of Computing Symmetric 
Functions in Terms of the Power Sums 

5. Consider a symmetric function $(a^\alpha b^\beta c^\gamma \cdots)$ of weight $w$ of the variates $x_1, x_2, \ldots, x_n$. By the theorem demonstrated in chapter I and stated at the close thereof it is possible to write

$$(a^\alpha b^\beta c^\gamma \cdots) = f(s_1, s_2, \cdots, s_w),$$

where $f$ stands for a rational, integral, algebraic function of the power sums $s_1, s_2, \ldots, s_w$ and where each term in $f$ is of total weight $w$, i.e. isobaric.

In the preceding chapter the direct method of computing a symmetric function in terms of the power sums has been illustrated. But that method has two major disadvantages. In the first place, it is necessary to know the expressions in terms of the power sums of the symmetric functions of lower weight; and in the second place, it becomes altogether impractical for anything but the simplest cases. It is proposed to develop a method which will have neither of these disadvantages; in other words, to develop a method which will express any given symmetric function directly in terms of the power sums without knowing the expressions for the symmetric functions of lower weight, and which will not become too unwieldy. In addition, the method ought to lend itself readily to the construction of tables of symmetric functions in terms of the power sums.

The method developed here will be a differential operator method. It may be stated at the outset that many schemes for determining differential operators which will do the work are possible. The writer has investigated a number of them. The operators developed here are given because they seem to satisfy best the demands just imposed on the method of computation. In fact, their simplicity and the directness with which they produce results indicate that they are the simplest differential operators that can be developed for the problem.

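6. Suppose now that a new variate $x_{n+1} = k$ is introduced. What effect will it have on $(a^\alpha b^\beta c^\gamma \cdots)$ and on $f$? First consider $(a^\alpha b^\beta c^\gamma \cdots)$. Since all the variates enter the symmetric function in exactly the same way, new terms involving $k$ in all the ways in which the other variates appear will be introduced. For example, if the original set of variates is $x_1, x_2, x_3, x_4$ and the original symmetric function $(32) = \sum x_i^3 x_j^2$, $i \ne j$, then this symmetric function is made up of the terms

$$\begin{array}{llll}
x_1^3 x_2^2 & x_2^3 x_1^2 & x_3^3 x_1^2 & x_4^3 x_1^2 \\
x_1^3 x_3^2 & x_2^3 x_3^2 & x_3^3 x_2^2 & x_4^3 x_2^2 \\
x_1^3 x_4^2 & x_2^3 x_4^2 & x_3^3 x_4^2 & x_4^3 x_3^2
\end{array}$$

Introducing a new variate $x_5 = k$ produces the new terms

$$x_1^3 k^2,\; x_2^3 k^2,\; x_3^3 k^2,\; x_4^3 k^2,\; k^3 x_1^2,\; k^3 x_2^2,\; k^3 x_3^2,\; k^3 x_4^2,$$

that is, produces $\sum x_i^3 k^2$ and $\sum k^3 x_j^2$. And since $k$ is a constant with respect to the summation, these summations may be written $k^2 \sum x_i^3$ and $k^3 \sum x_j^2$, $i, j = 1, 2, 3, 4$.

Hence $\sum_{i \ne j} x_i^3 x_j^2$ becomes

$$\sum_{i \ne j} x_i^3 x_j^2 + k^2 \sum_{i=1}^{4} x_i^3 + k^3 \sum_{j=1}^{4} x_j^2,$$

i.e., $(32)$ becomes $(32) + k^3 (2) + k^2 (3)$.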
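The effect of the new variate on $(32)$ is easy to confirm by direct enumeration. This is a small sketch; the function name `sym32` and the numerical values chosen for the variates and for $k$ are assumptions made only for the illustration.

```python
from itertools import permutations

def sym32(xs):
    """(32): the sum of x_i^3 x_j^2 over all ordered pairs with i != j."""
    return sum(a ** 3 * b ** 2 for a, b in permutations(xs, 2))

xs = [1, 2, 3, 4]      # the four original variates (values are arbitrary)
k = 5                  # the new variate x_5 = k

old = sym32(xs)
new = sym32(xs + [k])

s2 = sum(x ** 2 for x in xs)   # (2), taken over the original variates
s3 = sum(x ** 3 for x in xs)   # (3), taken over the original variates

# (32) becomes (32) + k^3 (2) + k^2 (3)
predicted = old + k ** 3 * s2 + k ** 2 * s3
```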

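Similarly $k$ must enter $(a^\alpha b^\beta c^\gamma \cdots)$ just as every other variate does. As a result new terms are produced and $(a^\alpha b^\beta c^\gamma \cdots)$ becomes

$$(a^\alpha b^\beta c^\gamma \cdots) + k^a (a^{\alpha-1} b^\beta c^\gamma \cdots) + k^b (a^\alpha b^{\beta-1} c^\gamma \cdots) + k^c (a^\alpha b^\beta c^{\gamma-1} \cdots) + \cdots .$$

Next find what happens to $f(s_1, s_2, \ldots, s_w)$ when the new variate $x_{n+1} = k$ is introduced. From the definition of the power sums it follows that

$s_1$ becomes $s_1 + k$,
$s_2$ becomes $s_2 + k^2$,
$s_3$ becomes $s_3 + k^3$,
. . . . .
$s_t$ becomes $s_t + k^t$,
. . . . .
$s_w$ becomes $s_w + k^w$.

Hence $f(s_1, s_2, \ldots, s_w)$ becomes

$$f(s_1 + k,\; s_2 + k^2,\; \cdots,\; s_w + k^w).$$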

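Taylor's series for several variables is

$$\begin{aligned}
f(x+h,\, y+k,\, z+m,\, \cdots) = f &+ (h\,\partial/\partial x + k\,\partial/\partial y + m\,\partial/\partial z + \cdots) f \\
&+ (h\,\partial/\partial x + k\,\partial/\partial y + m\,\partial/\partial z + \cdots)^2 f / 2! \\
&+ (h\,\partial/\partial x + k\,\partial/\partial y + m\,\partial/\partial z + \cdots)^3 f / 3! + \cdots,
\end{aligned}$$

where the multiplication of operators is algebraic.

Applying Taylor's series to the function under consideration, the result is

$$\begin{aligned}
f(s_1 + k,\, s_2 + k^2,\, \cdots,\, s_w + k^w) = f &+ (k\,\partial/\partial s_1 + k^2\,\partial/\partial s_2 + \cdots + k^w\,\partial/\partial s_w) f \\
&+ (k\,\partial/\partial s_1 + k^2\,\partial/\partial s_2 + \cdots + k^w\,\partial/\partial s_w)^2 f / 2! \\
&+ (k\,\partial/\partial s_1 + k^2\,\partial/\partial s_2 + \cdots + k^w\,\partial/\partial s_w)^3 f / 3! \\
&+ \cdots + (k\,\partial/\partial s_1 + k^2\,\partial/\partial s_2 + \cdots + k^w\,\partial/\partial s_w)^w f / w!,
\end{aligned}$$

all other terms being identically zero.

Now let

$$d_1 = \partial/\partial s_1, \quad d_2 = \partial/\partial s_2, \quad \ldots, \quad d_t = \partial/\partial s_t, \qquad t = 1, 2, \cdots, w.$$

Then $d_1^2 = (\partial/\partial s_1)(\partial/\partial s_1) = \partial^2/\partial s_1^2$ and similarly for the other products.

It is now possible to write

$$\begin{aligned}
f(s_1 + k,\, s_2 + k^2,\, \cdots,\, s_w + k^w) = f &+ (k d_1 + k^2 d_2 + k^3 d_3 + \cdots + k^w d_w) f \\
&+ (k d_1 + k^2 d_2 + k^3 d_3 + \cdots + k^w d_w)^2 f / 2! \\
&+ (k d_1 + k^2 d_2 + k^3 d_3 + \cdots + k^w d_w)^3 f / 3! \\
&+ \cdots + (k d_1 + k^2 d_2 + k^3 d_3 + \cdots + k^w d_w)^w f / w! .
\end{aligned}$$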






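Multiplying out and collecting coefficients of powers of $k$, this becomes

$$f(s_1 + k,\, s_2 + k^2,\, \cdots,\, s_w + k^w) = f + k D_1 f + k^2 D_2 f + k^3 D_3 f + \cdots + k^w D_w f,$$

all other terms vanishing, where

(1)
$$\begin{aligned}
D_1 &= d_1, \\
2!\,D_2 &= d_1^2 + 2 d_2, \\
3!\,D_3 &= d_1^3 + 6 d_1 d_2 + 6 d_3, \\
4!\,D_4 &= d_1^4 + 12 d_1^2 d_2 + 24 d_1 d_3 + 12 d_2^2 + 24 d_4, \\
5!\,D_5 &= d_1^5 + 20 d_1^3 d_2 + 60 d_1^2 d_3 + 60 d_1 d_2^2 + 120 d_1 d_4 + 120 d_2 d_3 + 120 d_5, \\
6!\,D_6 &= d_1^6 + 30 d_1^4 d_2 + 120 d_1^3 d_3 + 180 d_1^2 d_2^2 + 360 d_1^2 d_4 + 720 d_1 d_2 d_3 \\
&\quad + 720 d_1 d_5 + 720 d_2 d_4 + 120 d_2^3 + 360 d_3^2 + 720 d_6, \\
&\text{etc.}
\end{aligned}$$

Applying the multinomial theorem and then picking out the coefficient of $k^t$, the general term in this coefficient is found to be of the form

$$\frac{d_a^A\, d_b^B\, d_c^C \cdots}{A!\, B!\, C! \cdots},$$

where $a, b, c, \ldots$ and $A, B, C, \ldots$ are positive integers which satisfy the condition $aA + bB + cC + \cdots = t$.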


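Hence

$$t!\,D_t = \sum \frac{t!\; d_a^A\, d_b^B\, d_c^C \cdots}{A!\, B!\, C! \cdots}, \qquad \text{where } aA + bB + cC + \cdots = t;$$

i.e. the sum of all the different terms which can be formed by assigning to $a, b, c, \ldots, A, B, C, \ldots$ all positive integral values which satisfy the condition $aA + bB + cC + \cdots = t$.

From the above relations it follows also that

(2)
$$\begin{aligned}
d_1 &= D_1, \\
2 d_2 &= -(D_1^2 - 2 D_2), \\
3 d_3 &= D_1^3 - 3 D_1 D_2 + 3 D_3, \\
4 d_4 &= -(D_1^4 - 4 D_1^2 D_2 + 2 D_2^2 + 4 D_1 D_3 - 4 D_4), \\
5 d_5 &= D_1^5 - 5 D_1^3 D_2 + 5 D_1^2 D_3 + 5 D_1 D_2^2 - 5 D_1 D_4 - 5 D_2 D_3 + 5 D_5, \\
6 d_6 &= -(D_1^6 - 6 D_1^4 D_2 + 6 D_1^3 D_3 + 9 D_1^2 D_2^2 - 6 D_1^2 D_4 - 12 D_1 D_2 D_3 \\
&\qquad + 6 D_1 D_5 + 6 D_2 D_4 - 2 D_2^3 + 3 D_3^2 - 6 D_6), \\
t\, d_t &= \sum (-1)^{\nu + 1}\, \frac{(\nu - 1)!\; t}{A!\, B!\, C! \cdots}\; D_a^A\, D_b^B\, D_c^C \cdots,
\end{aligned}$$

where $a, b, c, \ldots, A, B, C, \ldots$ are positive integers and where the summation is over all the different terms which it is possible to obtain by assigning positive integral values to $a, b, c, \ldots, A, B, C, \ldots$ which satisfy the conditions $A + B + C + \cdots = \nu$, $aA + bB + cC + \cdots = t$.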
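The numerical coefficients in relations (1) follow directly from the general term $t!/(A!\,B!\,C!\cdots)$. The sketch below tabulates them from the partitions of $t$; the function names `partitions` and `D_coefficients` are hypothetical helpers introduced only for this illustration.

```python
from math import factorial

def partitions(t, largest=None):
    """Yield the partitions of t as non-increasing tuples of parts."""
    if largest is None:
        largest = t
    if t == 0:
        yield ()
        return
    for part in range(min(t, largest), 0, -1):
        for rest in partitions(t - part, part):
            yield (part,) + rest

def D_coefficients(t):
    """Map each partition of t (the subscripts of the d's) to its coefficient in t! D_t."""
    out = {}
    for p in partitions(t):
        denom = 1
        for part in set(p):
            denom *= factorial(p.count(part))    # A! B! C! ...
        out[p] = factorial(t) // denom
    return out

four = D_coefficients(4)   # e.g. (2, 1, 1) is the term d_1^2 d_2
six = D_coefficients(6)
```

Here the key `(2, 1, 1)` stands for $d_1^2 d_2$, so `four[(2, 1, 1)]` is the coefficient 12 appearing in $4!\,D_4$.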

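7. Now since $(a^\alpha b^\beta c^\gamma \cdots) = f$, therefore replacing $f$ by $(a^\alpha b^\beta c^\gamma \cdots)$ the effect of the introduction of the new variate may be written

$$(1 + k D_1 + k^2 D_2 + \cdots + k^w D_w)(a^\alpha b^\beta c^\gamma \cdots) = (a^\alpha b^\beta c^\gamma \cdots) + k^a (a^{\alpha-1} b^\beta c^\gamma \cdots) + k^b (a^\alpha b^{\beta-1} c^\gamma \cdots) + k^c (a^\alpha b^\beta c^{\gamma-1} \cdots) + \cdots .$$

Equating coefficients of equal powers of $k$, it is obvious that

(3)
$$\begin{aligned}
D_a (a^\alpha b^\beta c^\gamma \cdots) &= (a^{\alpha-1} b^\beta c^\gamma \cdots), \\
D_b (a^\alpha b^\beta c^\gamma \cdots) &= (a^\alpha b^{\beta-1} c^\gamma \cdots), \\
D_c (a^\alpha b^\beta c^\gamma \cdots) &= (a^\alpha b^\beta c^{\gamma-1} \cdots),
\end{aligned}$$

and also that

$$D_r (a^\alpha b^\beta c^\gamma \cdots) = 0 \quad \text{if } r \text{ is not among } a, b, c, \ldots .$$

The relations between $d$ and $D$ given above enable one to express $(a^\alpha b^\beta c^\gamma \cdots)$ in terms of the power sums.

One particular case is worthy of mention. If 1 is not among $a, b, c, \ldots$ then $D_1 (a^\alpha b^\beta c^\gamma \cdots) = 0$ and hence $d_1 f = 0$, and therefore every operator product containing $d_1$ as a factor also gives zero when applied to $f$.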

In this case the operator relations may be written simply

(1')
$$\begin{aligned}
D_1 &= 0, \\
D_2 &= d_2, \\
D_3 &= d_3, \\
2!\,D_4 &= d_2^2 + 2 d_4, \\
D_5 &= d_2 d_3 + d_5, \\
3!\,D_6 &= d_2^3 + 6 d_2 d_4 + 3 d_3^2 + 6 d_6, \\
&\text{etc.}
\end{aligned}$$

and

(2')
$$\begin{aligned}
d_1 &= 0, \\
d_2 &= D_2, \\
d_3 &= D_3, \\
2!\,d_4 &= 2 D_4 - D_2^2, \\
d_5 &= D_5 - D_2 D_3, \\
&\text{etc.}
\end{aligned}$$

Hence when 1 is not among $a, b, c, \ldots$ then $s_1$ cannot appear in the expression of $(a^\alpha b^\beta c^\gamma \cdots)$ in terms of the power sums, i.e. all the coefficients of terms involving $s_1$ vanish identically. But it must not be assumed that if $s_1 = 0$ then $d_1 f = 0$. Ordinarily this will not be true. It is necessary to find $d_1 f$ and in it set $s_1 = 0$. In statistics $s_1 = 0$ corresponds to the case where the variates are grouped about their arithmetic mean, i.e. so that $\sum x_i = 0$.

8. The application of these operators $d$ and $D$ to the computation of a symmetric function in terms of the power sums will now be demonstrated. After that their use in the construction of tables will be considered.


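Suppose it is desired to express $(3^2)$ in terms of the power sums. The only terms which may appear are given by the partitions of 6. There are eleven partitions of 6. Hence let

$$(3^2) = a_1 s_1^6 + a_2 s_1^4 s_2 + a_3 s_1^3 s_3 + a_4 s_1^2 s_2^2 + a_5 s_1^2 s_4 + a_6 s_1 s_2 s_3 + a_7 s_1 s_5 + a_8 s_2 s_4 + a_9 s_2^3 + a_{10} s_3^2 + a_{11} s_6 .$$

Since $(3^2)$ does not contain 1 as a part, $D_1 (3^2) = d_1 f = 0$ and $s_1$ cannot appear on the right side of the above equation,

$$a_1 = a_2 = a_3 = a_4 = a_5 = a_6 = a_7 = 0.$$

Now operate on the left side of the equation with $D_2$ and on the right with $d_2$:

$$D_2 (3^2) = 0 = a_8 s_4 + 3 a_9 s_2^2,$$

hence $0 = a_8$, $0 = 3 a_9$ and therefore $a_8 = a_9 = 0$.

Operating on the left with $D_3$ and on the right with $d_3$ gives $a_{10} = 1/2$ since $D_3 (3^2) = (3)$ and $d_3 f = 2 a_{10} s_3$, i.e. $s_3 = 2 a_{10} s_3$. Operating on the left with $6 D_6$ and on the right with $d_2^3 + 6 d_2 d_4 + 3 d_3^2 + 6 d_6$ gives $0 = 6 a_{10} + 6 a_{11}$ and thus $a_{11} = -1/2$. Hence

$$(3^2) = (s_3^2 - s_6)/2 .$$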
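The operator steps just used for $(3^2)$ can be replayed mechanically. The sketch below is an illustration under stated assumptions: a polynomial in the power sums is stored as a dictionary keyed by tuples of subscripts, and the helper `d` is a hypothetical name for the partial derivative $\partial/\partial s_t$. It applies the simplified relations that hold when $d_1 f = 0$ to the final result $(s_3^2 - s_6)/2$.

```python
from fractions import Fraction

# A polynomial in the power sums is a dict: key = tuple of subscripts
# (for example (3, 3) stands for s_3^2), value = coefficient.
def d(t, poly):
    """Partial derivative of poly with respect to s_t."""
    out = {}
    for mono, c in poly.items():
        n = mono.count(t)              # exponent of s_t in this monomial
        if n:
            rest = list(mono)
            rest.remove(t)             # lower the exponent of s_t by one
            key = tuple(rest)
            out[key] = out.get(key, 0) + c * n
    return {k: v for k, v in out.items() if v != 0}

f = {(3, 3): Fraction(1, 2), (6,): Fraction(-1, 2)}   # (3^2) = (s_3^2 - s_6)/2

D1_f = d(1, f)        # D_1 = d_1
D2_f = d(2, f)        # D_2 = d_2 when d_1 f = 0
D3_f = d(3, f)        # D_3 = d_3 when d_1 f = 0; should reproduce (3) = s_3

# 6 D_6 = d_2^3 + 6 d_2 d_4 + 3 d_3^2 + 6 d_6 when d_1 f = 0.
# Every term is a constant here because f has weight 6.
terms = [(1, d(2, d(2, d(2, f)))), (6, d(2, d(4, f))),
         (3, d(3, d(3, f))), (6, d(6, f))]
six_D6_f = sum(c * p.get((), 0) for c, p in terms)
```

The empty tuple `()` is the key of the constant term, so `six_D6_f` is the number that $6D_6$ produces from $f$.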

Similarly let

$$(31^2) = a_1 s_1^5 + a_2 s_1^3 s_2 + a_3 s_1^2 s_3 + a_4 s_1 s_2^2 + a_5 s_1 s_4 + a_6 s_2 s_3 + a_7 s_5 .$$

Operate on the right with $d_1^2$ and on the left with $D_1^2$. This gives

$$s_3 = 20 a_1 s_1^3 + 6 a_2 s_1 s_2 + 2 a_3 s_3,$$

hence $a_1 = a_2 = 0$, $a_3 = 1/2$.

Operate on the right with $2 d_2$, on the left with $-(D_1^2 - 2 D_2)$. Then

$$-s_3 = 2 a_2 s_1^3 + 4 a_4 s_1 s_2 + 2 a_6 s_3$$

and $a_4 = 0$, $a_6 = -1/2$.

Operate on the right with $4 d_4$ and on the left with its equivalent in terms of $D$. Then

$$-4 s_1 = 4 a_5 s_1, \qquad a_5 = -1 .$$

Similarly, operating on the right with $5 d_5$ and on the left with its equivalent in terms of $D$, the result is $5 = 5 a_7$, $a_7 = 1$. Hence

$$(31^2) = (s_3 s_1^2 - 2 s_4 s_1 - s_3 s_2 + 2 s_5)/2 .$$

In the case of $(3^2)$ the operations on the left were performed with $D_1$, $D_2$, $D_3$ and $D_6$ and on the right with their equivalent expressions in terms of $d_1$, $d_2$, $d_3$ and $d_6$, with $D_1 = d_1 = 0$. In the case of $(31^2)$ the operations on the right were performed with $d_1^2$, $2 d_2$, $4 d_4$ and $5 d_5$, and on the left with their equivalent expressions in terms of $D_1, D_2, \ldots, D_5$. Obviously it is immaterial from a theoretical point of view which procedure is followed. For practical purposes it will usually be found that the procedure followed in the case of $(31^2)$ is preferable.

9. The application of the operators to the construction of tables of symmetric functions in terms of the power sums will now be illustrated.

Weight 1:

1. $(1) = s_1$.

Weight 2:

$D_1 (1^2) = d_1 (a_1 s_1^2 + a_2 s_2)$, $a_1 = 1/2$.

$2 D_2 (1^2) = (d_1^2 + 2 d_2)(a_1 s_1^2 + a_2 s_2) = 0$, $a_2 = -1/2$.

$(1^2) = (s_1^2 - s_2)/2$.

Weight 3:

For all the symmetric functions of weight 3, $f$ will be of the form

$$f = a_1 s_1^3 + a_2 s_2 s_1 + a_3 s_3 .$$
$$d_1 f = 3 a_1 s_1^2 + a_2 s_2 .$$
$$(d_1^3 + 6 d_1 d_2 + 6 d_3) f = 6 (a_1 + a_2 + a_3) .$$

1. $(3) = s_3$.

2. $(21) = s_2 s_1 - s_3$, since $D_1 (21) = (2) = s_2$; therefore $a_1 = 0$, $a_2 = 1$; $6 D_3 (21) = 0$ and hence $a_3 = -1$.

3. $(1^3) = (s_1^3 - 3 s_2 s_1 + 2 s_3)/6$ since $D_1 (1^3) = (1^2)$ and $(1^2) = (s_1^2 - s_2)/2$; therefore $a_1 = 1/6$, $a_2 = -1/2$; $6 D_3 (1^3) = 0$, hence $a_3 = 1/3$.

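Weight 4:

For all the symmetric functions of weight 4, $f$ will have the form

$$f = a_1 s_1^4 + a_2 s_1^2 s_2 + a_3 s_1 s_3 + a_4 s_2^2 + a_5 s_4 .$$
$$d_1 f = 4 a_1 s_1^3 + 2 a_2 s_1 s_2 + a_3 s_3 .$$
$$(d_1^2 + 2 d_2) f = 2 (6 a_1 + a_2) s_1^2 + 2 (a_2 + 2 a_4) s_2 .$$
$$(d_1^4 + 12 d_1^2 d_2 + 24 d_1 d_3 + 12 d_2^2 + 24 d_4) f = 24 (a_1 + a_2 + a_3 + a_4 + a_5) .$$

1. $(4) = s_4$.

2. $(2^2) = (s_2^2 - s_4)/2$, since $D_1 (2^2) = 0$, $a_1 = a_2 = a_3 = 0$; $2 D_2 (2^2) = 2 (2) = 2 s_2$, $a_4 = 1/2$; $24 D_4 (2^2) = 0$, $a_5 = -1/2$.

3. $(31) = s_3 s_1 - s_4$, since $D_1 (31) = (3) = s_3$, $a_1 = a_2 = 0$, $a_3 = 1$; $2 D_2 (31) = 0$, $a_4 = 0$; $24 D_4 (31) = 0$, $a_5 = -1$.

4. $(21^2) = (s_1^2 s_2 - 2 s_1 s_3 - s_2^2 + 2 s_4)/2$, since $D_1 (21^2) = (21) = s_2 s_1 - s_3$, $a_1 = 0$, $a_2 = 1/2$, $a_3 = -1$; $2 D_2 (21^2) = 2 (1^2) = (s_1^2 - s_2)$, $2 a_4 = -1$, $a_4 = -1/2$; $24 D_4 (21^2) = 0$, $a_5 = 1$.

5. $(1^4) = (s_1^4 - 6 s_1^2 s_2 + 8 s_1 s_3 + 3 s_2^2 - 6 s_4)/24$, since $D_1 (1^4) = (1^3) = (s_1^3 - 3 s_2 s_1 + 2 s_3)/6$, $a_1 = 1/24$, $a_2 = -1/4$, $a_3 = 1/3$; $2 D_2 (1^4) = 0$, $a_4 = 1/8$; $24 D_4 (1^4) = 0$, $a_5 = -1/4$.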

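Weight 5:

$$f = a_1 s_1^5 + a_2 s_1^3 s_2 + a_3 s_1^2 s_3 + a_4 s_1 s_2^2 + a_5 s_1 s_4 + a_6 s_2 s_3 + a_7 s_5 .$$
$$d_1 f = 5 a_1 s_1^4 + 3 a_2 s_1^2 s_2 + 2 a_3 s_1 s_3 + a_4 s_2^2 + a_5 s_4 .$$
$$(d_1^2 + 2 d_2) f = 2 (10 a_1 + a_2) s_1^3 + 2 (3 a_2 + 2 a_4) s_2 s_1 + 2 (a_3 + a_6) s_3 .$$
$$(d_1^5 + 20 d_1^3 d_2 + 60 d_1^2 d_3 + 60 d_1 d_2^2 + 120 d_1 d_4 + 120 d_2 d_3 + 120 d_5) f = 120 (a_1 + a_2 + a_3 + a_4 + a_5 + a_6 + a_7) .$$

1. $(5) = s_5$.

2. $(32) = s_3 s_2 - s_5$, since $D_1 (32) = 0$, $a_1 = a_2 = a_3 = a_4 = a_5 = 0$; $2 D_2 (32) = 2 (3)$, $a_6 = 1$; $120 D_5 (32) = 0$, $a_7 = -1$.

3. $(41) = s_4 s_1 - s_5$, since $D_1 (41) = (4)$, $a_1 = a_2 = a_3 = a_4 = 0$, $a_5 = 1$; $2 D_2 (41) = 0$, $a_6 = 0$; $120 D_5 (41) = 0$, $a_7 = -1$.

4. $(2^2 1) = (s_2^2 s_1 - 2 s_3 s_2 - s_4 s_1 + 2 s_5)/2$, since $D_1 (2^2 1) = (2^2)$, $a_1 = a_2 = a_3 = 0$, $a_4 = 1/2$, $a_5 = -1/2$; $2 D_2 (2^2 1) = 2 (21)$, $a_6 = -1$; $120 D_5 (2^2 1) = 0$, $a_7 = 1$.

5. $(31^2) = (s_3 s_1^2 - 2 s_4 s_1 - s_3 s_2 + 2 s_5)/2$, since $D_1 (31^2) = (31)$, $a_1 = a_2 = a_4 = 0$, $a_3 = 1/2$, $a_5 = -1$; $2 D_2 (31^2) = 0$, $a_6 = -1/2$; $120 D_5 (31^2) = 0$, $a_7 = 1$.

6. $(21^3) = (s_2 s_1^3 - 3 s_3 s_1^2 - 3 s_2^2 s_1 + 6 s_4 s_1 + 5 s_3 s_2 - 6 s_5)/6$, since $D_1 (21^3) = (21^2)$, $a_1 = 0$, $a_2 = 1/6$, $a_3 = -1/2$, $a_4 = -1/2$, $a_5 = 1$; $2 D_2 (21^3) = 2 (1^3)$, $a_6 = 5/6$; $120 D_5 (21^3) = 0$, $a_7 = -1$.

7. $(1^5) = (s_1^5 - 10 s_2 s_1^3 + 20 s_3 s_1^2 + 15 s_2^2 s_1 - 30 s_4 s_1 - 20 s_3 s_2 + 24 s_5)/120$, since $D_1 (1^5) = (1^4)$, $a_1 = 1/120$, $a_2 = -1/12$, $a_3 = 1/6$, $a_4 = 1/8$, $a_5 = -1/4$; $2 D_2 (1^5) = 0$, $a_6 = -a_3 = -1/6$; $120 D_5 (1^5) = 0$, $a_7 = 1/5$.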


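Weight 6:

$$f = a_1 s_1^6 + a_2 s_1^4 s_2 + a_3 s_1^3 s_3 + a_4 s_1^2 s_2^2 + a_5 s_1^2 s_4 + a_6 s_1 s_2 s_3 + a_7 s_1 s_5 + a_8 s_2 s_4 + a_9 s_2^3 + a_{10} s_3^2 + a_{11} s_6 .$$
$$d_1 f = 6 a_1 s_1^5 + 4 a_2 s_1^3 s_2 + 3 a_3 s_1^2 s_3 + 2 a_4 s_1 s_2^2 + 2 a_5 s_1 s_4 + a_6 s_2 s_3 + a_7 s_5 .$$
$$(d_1^2 + 2 d_2) f = 2 (15 a_1 + a_2) s_1^4 + 2 (6 a_2 + 2 a_4) s_2 s_1^2 + 2 (3 a_3 + a_6) s_3 s_1 + 2 (a_4 + 3 a_9) s_2^2 + 2 (a_5 + a_8) s_4 .$$
$$(d_1^3 + 6 d_1 d_2 + 6 d_3) f = 6 (20 a_1 + 4 a_2 + a_3) s_1^3 + 6 (4 a_2 + 4 a_4 + a_6) s_2 s_1 + 6 (a_3 + a_6 + 2 a_{10}) s_3 .$$
$$(d_1^6 + 30 d_1^4 d_2 + 120 d_1^3 d_3 + 180 d_1^2 d_2^2 + 360 d_1^2 d_4 + 720 d_1 d_2 d_3 + 720 d_1 d_5 + 720 d_2 d_4 + 120 d_2^3 + 360 d_3^2 + 720 d_6) f = 720 (a_1 + a_2 + a_3 + \cdots + a_{11}) .$$

1. $(6) = s_6$.

2. $(3^2) = (s_3^2 - s_6)/2$, since operating on this symmetric function with $D_1$, $2 D_2$, $6 D_3$, $720 D_6$ and comparing coefficients of the symmetric functions thus obtained with the result of the operations on $f$ above gives $a_1 = a_2 = \cdots = a_9 = 0$, $a_{10} = 1/2$, $a_{11} = -1/2$.

3. $(2^3) = (s_2^3 - 3 s_4 s_2 + 2 s_6)/6$. For operating with $D_1$ and comparing coefficients of $D_1 (2^3) = 0$ with $d_1 f$ above gives $a_1 = a_2 = a_3 = a_4 = a_5 = a_6 = a_7 = 0$. Similarly, operating with $2 D_2$ gives $a_9 = 1/6$, $a_8 = -1/2$. Operating with $6 D_3$ gives $a_{10} = 0$. Operating with $720 D_6$ gives $a_{11} = 1/3$.

4. $(42) = s_4 s_2 - s_6$.

5. $(51) = s_5 s_1 - s_6$.

6. $(321) = s_3 s_2 s_1 - s_5 s_1 - s_4 s_2 - s_3^2 + 2 s_6$.

7. $(41^2) = (s_4 s_1^2 - 2 s_5 s_1 - s_4 s_2 + 2 s_6)/2$.

8. $(2^2 1^2) = (s_2^2 s_1^2 - s_4 s_1^2 - 4 s_3 s_2 s_1 + 4 s_5 s_1 + 5 s_4 s_2 - s_2^3 + 2 s_3^2 - 6 s_6)/4$.

9. $(31^3) = (s_3 s_1^3 - 3 s_4 s_1^2 - 3 s_3 s_2 s_1 + 6 s_5 s_1 + 3 s_4 s_2 + 2 s_3^2 - 6 s_6)/6$.

10. $(21^4) = (s_2 s_1^4 - 4 s_3 s_1^3 - 6 s_2^2 s_1^2 + 12 s_4 s_1^2 + 20 s_3 s_2 s_1 - 24 s_5 s_1 - 18 s_4 s_2 + 3 s_2^3 - 8 s_3^2 + 24 s_6)/24$.

11. $(1^6) = (s_1^6 - 15 s_2 s_1^4 + 40 s_3 s_1^3 + 45 s_2^2 s_1^2 - 90 s_4 s_1^2 - 120 s_3 s_2 s_1 + 144 s_5 s_1 + 90 s_4 s_2 - 15 s_2^3 + 40 s_3^2 - 120 s_6)/720$.

Note that only the four operator relations given above have been used in finding the expressions for all eleven symmetric functions of weight 6.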
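Entries of the weight 6 table can be checked against a direct enumeration of the symmetric functions. The following sketch does so for six of the eleven entries; the helper `monomial_sym`, the dictionary `table`, and the test variates are illustrative assumptions, and the check is numerical rather than a proof.

```python
from itertools import permutations
from math import factorial

def monomial_sym(xs, parts):
    """Symmetric function with the given parts, e.g. (2, 2, 1, 1) for (2^2 1^2)."""
    ordered = 0
    for idx in permutations(range(len(xs)), len(parts)):
        term = 1
        for i, e in zip(idx, parts):
            term *= xs[i] ** e
        ordered += term
    repeats = 1
    for p in set(parts):
        repeats *= factorial(parts.count(p))   # each distinct term was counted this often
    return ordered // repeats

xs = [1, 2, 3, 5, 7, 11]                       # arbitrary variates (assumption)
s1, s2, s3, s4, s5, s6 = (sum(x ** t for x in xs) for t in range(1, 7))

# partition -> (numerator in the power sums, divisor)
table = {
    (3, 2, 1): (s3*s2*s1 - s5*s1 - s4*s2 - s3**2 + 2*s6, 1),
    (4, 1, 1): (s4*s1**2 - 2*s5*s1 - s4*s2 + 2*s6, 2),
    (2, 2, 1, 1): (s2**2*s1**2 - s4*s1**2 - 4*s3*s2*s1 + 4*s5*s1
                   + 5*s4*s2 - s2**3 + 2*s3**2 - 6*s6, 4),
    (3, 1, 1, 1): (s3*s1**3 - 3*s4*s1**2 - 3*s3*s2*s1 + 6*s5*s1
                   + 3*s4*s2 + 2*s3**2 - 6*s6, 6),
    (2, 1, 1, 1, 1): (s2*s1**4 - 4*s3*s1**3 - 6*s2**2*s1**2 + 12*s4*s1**2
                      + 20*s3*s2*s1 - 24*s5*s1 - 18*s4*s2 + 3*s2**3
                      - 8*s3**2 + 24*s6, 24),
    (1, 1, 1, 1, 1, 1): (s1**6 - 15*s2*s1**4 + 40*s3*s1**3 + 45*s2**2*s1**2
                         - 90*s4*s1**2 - 120*s3*s2*s1 + 144*s5*s1 + 90*s4*s2
                         - 15*s2**3 + 40*s3**2 - 120*s6, 720),
}

# For each entry compare the numerator with divisor * (direct enumeration).
results = {p: (num, div * monomial_sym(xs, p)) for p, (num, div) in table.items()}
```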





CHAPTER III 


Symmetric Functions of Symmetric Functions. 
A Problem in Sampling 


10. Consider again the $n$ variates $x_1, x_2, \ldots, x_n$. Let $s_{1:x}, s_{2:x}, \ldots, s_{t:x}, \ldots$ denote the power sums, the $x$ subscript being introduced here to keep in the foreground the fact that the summation is with respect to $x$. Now raise each variate to the power $m$, where $m$ is a positive integer. Thus a new set of variates is produced, viz. $x_1^m, x_2^m, \ldots, x_n^m$. Suppose now that samples, each containing $r$ variates, $(r \le n)$, are drawn in all possible ways from these $n$ new variates. Obviously there will be ${}_nC_r$ samples. Denote* them as follows:

[Array listing the ${}_nC_r$ samples; illegible in the source.]

where $z_i$ is the sum of the $r$ variates appearing in the $i$th sample.

Further, let $s_{1:z}, s_{2:z}, \ldots, s_{t:z}, \ldots$ represent the power sums with respect to $z$.

*Notation suggested by Editorial, Annals of Mathematical Statistics, 1 (1930), page 100.

Now, since each $z_i$ is a symmetric function of certain of the $x_1, x_2, \ldots, x_n$, any symmetric function of the $z_i$ is a symmetric function of symmetric functions. The situation here is then considerably more complex than in the preceding chapters. The problem now is to express any symmetric function of the $z_i$ in terms of the power sums with respect to $x$. It is not difficult to imagine how much more complicated and tedious the direct computation is here than in the problem already dealt with. But these symmetric functions, particularly the power sums with respect to $z$, play such an important role in the theory of sampling that it is now proposed to develop a differential operator method for expressing symmetric functions of the $z_i$ in terms of the power sums with respect to $x$.

On account of the presence here of symmetric functions of both $x$ and $z$ it is necessary to modify the notation of the preceding chapters. Let $(a^\alpha b^\beta c^\gamma \cdots)_x$ be the general symmetric function with respect to $x$ and $(a^\alpha b^\beta c^\gamma \cdots)_z$ the same general symmetric function with respect to $z$. Under this notation the power sums with respect to $x$ may be written $(1)_x, (2)_x, \ldots, (t)_x$, and the power sums with respect to $z$ become $(1)_z, (2)_z, \ldots, (t)_z$.

11. Case $m = 1$.

Consider first of all the case of samples when $m = 1$. In developing an operator method for expressing $(a^\alpha b^\beta c^\gamma \cdots)_z$ in terms of the power sums with respect to $x$ it will not be necessary to deal with this general case. For the operators developed in chapter II will express $(a^\alpha b^\beta c^\gamma \cdots)_z$ in terms of the power sums with respect to $z$. Hence all that is required is an operator method for expressing the power sums with respect to $z$ in terms of the power sums with respect to $x$.

That it is possible to express the power sums with respect to $z$ in terms of the power sums with respect to $x$ can be demonstrated by direct methods. Recall the theorem stated at the close of chapter I and note also that in any power sum with respect to $z$ each term is a symmetric function (a power sum in fact) of certain of the $x_1, x_2, \ldots, x_n$. Each $x$ enters exactly the same as every other $x$ and the power sum with respect to $z$ is unaltered by interchanges or permutations of $x_1, x_2, \ldots, x_n$. Hence the symmetric function with respect to $z$ is also a symmetric function with respect to $x$ and therefore can be expressed as a rational, integral, algebraic function of the power sums with respect to $x$. Moreover, as before, each term in the rational, integral, algebraic function of the power sums with respect to $x$ will be of total weight $w$ if the symmetric function of the $z_i$ is of weight $w$; that is, the symmetric function is of the same weight in $x$ as it is in $z$. This last conclusion follows directly from the definition of the $z_i$.

Although the problem here is more complicated than that in chapter II, nevertheless the approach to the problem in that case suggests a beginning here. Let

$$(w)_z = f(s_{1:x}, s_{2:x}, \cdots, s_{w:x}),$$

where $f$ is a rational, integral, algebraic function of the power sums with respect to $x$. Since $(w)_z$ is of weight $w$, no power sum of weight greater than $w$ can appear in $f$, i.e. no power sum higher than $s_{w:x}$.

Introducing a new variate $x_{n+1} = k$, as before, changes $f(s_{1:x}, s_{2:x}, \cdots, s_{w:x})$ into $f(s_{1:x} + k,\, s_{2:x} + k^2,\, \cdots,\, s_{w:x} + k^w)$; it has already been shown that this new $f$ may be written

$$(1 + k D_1 + k^2 D_2 + \cdots + k^w D_w) f,$$

where, if $d_t = \partial/\partial s_{t:x}$, the relations between $D$ and $d$ are given by (1) and (2) of chapter II.

What is the effect of the new variate $x_{n+1} = k$ on $(w)_z$? If no further assumptions are made then obviously there will now be ${}_{n+1}C_r$ samples. The introduction of new samples complicates things and no operator relations are obtained. It would seem desirable to preserve the number of samples. This may be done by making suitable assumptions. Just as the new variate is arbitrarily introduced, so its behaviour in the sampling process may be arbitrarily determined in any way that will bring results. With this in mind, select any one of the original variates, say $x_1$. Let $x_{n+1} = k$. Now assume that $x_{n+1}$ is so related with $x_1$ that in the sampling process every sample which contains $x_1$ also contains $x_{n+1}$, i.e. contains $(x_1 + k)$. In other words, in order to keep the number of samples the same, $x_1$ and $x_{n+1}$ are always taken together in the samples.

Now each variate appears in $(1)_z$ exactly ${}_{n-1}C_{r-1}$ times. Hence $x_{n+1} = k$ appears ${}_{n-1}C_{r-1}$ times in the new $(1)_z$. Therefore the new $(1)_z$ is equal to the original $(1)_z$ increased by $k \cdot {}_{n-1}C_{r-1}$. Similarly $(2)_z$ becomes $(2)_z + 2k (1)_{z'} + k^2\, {}_{n-1}C_{r-1}$, where the prime above $z$ indicates here, and in what follows, that $(t)_{z'}$ is obtained from $(t)_z$ by replacing $n$ and $r$ by $n - 1$ and $r - 1$ respectively in the expression for $(t)_z$ in terms of the power sums with respect to $x$. For example, since $(1)_z = {}_{n-1}C_{r-1}\, s_{1:x}$, $(1)_{z'} = {}_{n-2}C_{r-2}\, s_{1:x}$.

Applying the multinomial theorem to the samples, the effect of the new variate may be written

$(1)_z$ becomes $(1)_z + k\, {}_{n-1}C_{r-1}$,
$(2)_z$ becomes $(2)_z + 2k (1)_{z'} + k^2\, {}_{n-1}C_{r-1}$,
$(3)_z$ becomes $(3)_z + 3k (2)_{z'} + 3k^2 (1)_{z'} + k^3\, {}_{n-1}C_{r-1}$,
. . . . .
$(w)_z$ becomes $(w)_z + w k (w-1)_{z'} + {}_wC_2\, k^2 (w-2)_{z'} + \cdots + {}_wC_v\, k^v (w-v)_{z'} + \cdots + k^w\, {}_{n-1}C_{r-1}$.

. . 







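Now since $(w)_z = f$, therefore

$$(1 + k D_1 + k^2 D_2 + \cdots + k^w D_w)(w)_z = (w)_z + w k (w-1)_{z'} + {}_wC_2\, k^2 (w-2)_{z'} + \cdots + k^w\, {}_{n-1}C_{r-1} .$$

Equating coefficients of equal powers of $k$ it follows that:

$$\begin{aligned}
D_1 (w)_z &= w\, (w-1)_{z'}, \\
D_2 (w)_z &= {}_wC_2\, (w-2)_{z'}, \\
&\;\;\vdots \\
D_v (w)_z &= {}_wC_v\, (w-v)_{z'}, \\
&\;\;\vdots \\
D_w (w)_z &= {}_{n-1}C_{r-1} .
\end{aligned}$$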

12. Before proceeding to the application of these operators it ought to be remarked that other sets of differential operators can be developed. For instance, it is possible to develop a complete set of differential operator relations by adding $k$ to each of the given variates. But the operators thus obtained are very cumbersome in comparison with those developed above. The statement made with respect to the operators developed in chapter II may be repeated here. There is every reason to believe that the differential operators developed here are the simplest that can be obtained for the problem.

13. The use of the operators developed in this chapter will now be illustrated by computing a few power sums with respect to $z$ in terms of the power sums with respect to $x$.

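1. Let $(1)_z = f = a_1 s_{1:x}$. Then

$$D_1 (1)_z = d_1 f, \qquad {}_{n-1}C_{r-1} = a_1 .$$

Hence

$$(1)_z = {}_{n-1}C_{r-1}\, s_{1:x} .$$

2. Let $(2)_z = f = a_1 s_{1:x}^2 + a_2 s_{2:x}$.

$$D_1 (2)_z = d_1 f, \qquad 2 (1)_{z'} = d_1 f,$$
$$2\, {}_{n-2}C_{r-2}\, s_{1:x} = 2 a_1 s_{1:x}, \qquad a_1 = {}_{n-2}C_{r-2} .$$
$$2!\, D_2 (2)_z = (d_1^2 + 2 d_2) f,$$
$$2\, {}_{n-1}C_{r-1} = 2 (a_1 + a_2), \qquad a_2 = {}_{n-1}C_{r-1} - {}_{n-2}C_{r-2} .$$

3. Let $(3)_z = f = a_1 s_{1:x}^3 + a_2 s_{1:x} s_{2:x} + a_3 s_{3:x}$.

$$D_1 (3)_z = d_1 f, \qquad 3 (2)_{z'} = d_1 f,$$
$$3\, {}_{n-3}C_{r-3}\, s_{1:x}^2 + 3 ({}_{n-2}C_{r-2} - {}_{n-3}C_{r-3})\, s_{2:x} = 3 a_1 s_{1:x}^2 + a_2 s_{2:x},$$
$$a_1 = {}_{n-3}C_{r-3}, \qquad a_2 = 3 ({}_{n-2}C_{r-2} - {}_{n-3}C_{r-3}) .$$
$$3!\, D_3 (3)_z = (d_1^3 + 6 d_1 d_2 + 6 d_3) f,$$
$$a_3 = {}_{n-1}C_{r-1} - 3\, {}_{n-2}C_{r-2} + 2\, {}_{n-3}C_{r-3} .$$
$$(3)_z = {}_{n-3}C_{r-3}\, s_{1:x}^3 + 3 ({}_{n-2}C_{r-2} - {}_{n-3}C_{r-3})\, s_{1:x} s_{2:x} + ({}_{n-1}C_{r-1} - 3\, {}_{n-2}C_{r-2} + 2\, {}_{n-3}C_{r-3})\, s_{3:x} .$$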
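The first three results can be compared with a direct enumeration of all samples. This is a minimal sketch: the names `sample_power_sum` and `rho`, the variates, and the sample size are assumptions chosen for the illustration, with `rho(n, r, k)` standing for ${}_{n-k}C_{r-k}$.

```python
from itertools import combinations
from math import comb

def sample_power_sum(xs, r, t):
    """(t)_z: the sum, over all samples of r variates, of (sample total)^t."""
    return sum(sum(c) ** t for c in combinations(xs, r))

def rho(n, r, k):
    """C(n-k, r-k): the number of samples containing k specified variates."""
    return comb(n - k, r - k) if r >= k else 0

xs = [1, 2, 4, 7, 11, 13]          # arbitrary variates (assumption)
n, r = len(xs), 3
s1, s2, s3 = (sum(x ** t for x in xs) for t in (1, 2, 3))
p1, p2, p3 = (rho(n, r, k) for k in (1, 2, 3))

z1 = p1 * s1
z2 = p2 * s1 ** 2 + (p1 - p2) * s2
z3 = p3 * s1 ** 3 + 3 * (p2 - p3) * s1 * s2 + (p1 - 3 * p2 + 2 * p3) * s3
```

With $n = 6$ and $r = 3$ there are 20 samples, so the enumeration is immediate; larger cases only need different `xs` and `r`.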


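4. Let

$$(4)_z = f = a_1 s_{1:x}^4 + a_2 s_{1:x}^2 s_{2:x} + a_3 s_{1:x} s_{3:x} + a_4 s_{2:x}^2 + a_5 s_{4:x} .$$

$$D_1 (4)_z = d_1 f, \qquad 4 (3)_{z'} = d_1 f,$$
$$4 \left[ {}_{n-4}C_{r-4}\, s_{1:x}^3 + 3 ({}_{n-3}C_{r-3} - {}_{n-4}C_{r-4})\, s_{1:x} s_{2:x} + ({}_{n-2}C_{r-2} - 3\, {}_{n-3}C_{r-3} + 2\, {}_{n-4}C_{r-4})\, s_{3:x} \right] = 4 a_1 s_{1:x}^3 + 2 a_2 s_{1:x} s_{2:x} + a_3 s_{3:x},$$
$$a_1 = {}_{n-4}C_{r-4}, \quad a_2 = 6 ({}_{n-3}C_{r-3} - {}_{n-4}C_{r-4}), \quad a_3 = 4 ({}_{n-2}C_{r-2} - 3\, {}_{n-3}C_{r-3} + 2\, {}_{n-4}C_{r-4}) .$$

$$2!\, D_2 (4)_z = (d_1^2 + 2 d_2) f,$$
$$12\, {}_{n-3}C_{r-3}\, s_{1:x}^2 + 12 ({}_{n-2}C_{r-2} - {}_{n-3}C_{r-3})\, s_{2:x} = 2 (6 a_1 + a_2)\, s_{1:x}^2 + 2 (a_2 + 2 a_4)\, s_{2:x},$$
$$a_4 = 3 ({}_{n-2}C_{r-2} - 2\, {}_{n-3}C_{r-3} + {}_{n-4}C_{r-4}) .$$

$$4!\, D_4 (4)_z = (d_1^4 + 12 d_1^2 d_2 + 24 d_1 d_3 + 12 d_2^2 + 24 d_4) f,$$
$$a_5 = {}_{n-1}C_{r-1} - 7\, {}_{n-2}C_{r-2} + 12\, {}_{n-3}C_{r-3} - 6\, {}_{n-4}C_{r-4} .$$

$$\begin{aligned}
(4)_z ={}& {}_{n-4}C_{r-4}\, s_{1:x}^4 + 6 ({}_{n-3}C_{r-3} - {}_{n-4}C_{r-4})\, s_{1:x}^2 s_{2:x} \\
& + 4 ({}_{n-2}C_{r-2} - 3\, {}_{n-3}C_{r-3} + 2\, {}_{n-4}C_{r-4})\, s_{1:x} s_{3:x} \\
& + 3 ({}_{n-2}C_{r-2} - 2\, {}_{n-3}C_{r-3} + {}_{n-4}C_{r-4})\, s_{2:x}^2 \\
& + ({}_{n-1}C_{r-1} - 7\, {}_{n-2}C_{r-2} + 12\, {}_{n-3}C_{r-3} - 6\, {}_{n-4}C_{r-4})\, s_{4:x} .
\end{aligned}$$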

14. Now consider the case where $m$ is any positive integer. Write $y_i = x_i^m$. The operators developed in this chapter will express any power sum with respect to $z$ in terms of the power sums with respect to $y$, i.e. in terms of $s_{1:y}, s_{2:y}, \ldots$. But obviously $s_{t:y} = s_{tm:x}$ and hence the operators of this chapter will express any symmetric function which is a power sum with respect to $z$ in terms of power sums with respect to $x$, viz. in terms of $s_{m:x}, s_{2m:x}, s_{3m:x}, \ldots$ where $m$ is a positive integer. Hence the operators developed in chapters II and III will express any symmetric function of $z_i$, $i = 1, 2, \ldots, {}_nC_r$, where $z_i$ is the sum of the $m$th powers of the variates in the $i$th sample and $m$ is a positive integer, in terms of power sums with respect to $x$. In particular

$$(1)_z = {}_{n-1}C_{r-1}\, s_{m:x},$$
$$(2)_z = {}_{n-2}C_{r-2}\, s_{m:x}^2 + ({}_{n-1}C_{r-1} - {}_{n-2}C_{r-2})\, s_{2m:x},$$
$$(3)_z = {}_{n-3}C_{r-3}\, s_{m:x}^3 + 3 ({}_{n-2}C_{r-2} - {}_{n-3}C_{r-3})\, s_{m:x} s_{2m:x} + ({}_{n-1}C_{r-1} - 3\, {}_{n-2}C_{r-2} + 2\, {}_{n-3}C_{r-3})\, s_{3m:x} .$$

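15. Consider again the case $m = 1$. Let $\rho_1 = {}_{n-1}C_{r-1}$, $\rho_2 = {}_{n-2}C_{r-2}$, $\ldots$, $\rho_k = {}_{n-k}C_{r-k}$. Then*

$$s_{1:z} = \rho_1 s_{1:x},$$
$$s_{2:z} = \rho_2 s_{1:x}^2 + (\rho_1 - \rho_2)\, s_{2:x},$$
$$s_{3:z} = \rho_3 s_{1:x}^3 + 3 (\rho_2 - \rho_3)\, s_{1:x} s_{2:x} + (\rho_1 - 3 \rho_2 + 2 \rho_3)\, s_{3:x},$$
$$s_{4:z} = \rho_4 s_{1:x}^4 + 6 (\rho_3 - \rho_4)\, s_{1:x}^2 s_{2:x} + 4 (\rho_2 - 3 \rho_3 + 2 \rho_4)\, s_{1:x} s_{3:x} + 3 (\rho_2 - 2 \rho_3 + \rho_4)\, s_{2:x}^2 + (\rho_1 - 7 \rho_2 + 12 \rho_3 - 6 \rho_4)\, s_{4:x},$$

etc.

*Notation suggested in Editorial, Annals of Mathematical Statistics, 1 (1930), page 104.

The question as to whether the coefficients in the above expressions follow any simple law now arises. Instead of $\rho_k = {}_{n-k}C_{r-k}$, write $\rho^k$. Let

$$\begin{aligned}
P_1(\rho) &= \rho, \\
P_2(\rho) &= \rho - \rho^2, \\
P_3(\rho) &= \rho - 3 \rho^2 + 2 \rho^3, \\
P_4(\rho) &= \rho - 7 \rho^2 + 12 \rho^3 - 6 \rho^4, \\
&\text{etc.}
\end{aligned}$$

Further, let $P_t$ be the expression obtained from $P_t(\rho)$ by going back to subscripts instead of exponents. Then

$$\begin{aligned}
P_1 &= \rho_1, \\
P_2 &= \rho_1 - \rho_2, \\
P_3 &= \rho_1 - 3 \rho_2 + 2 \rho_3, \\
P_4 &= \rho_1 - 7 \rho_2 + 12 \rho_3 - 6 \rho_4, \\
&\text{etc.}
\end{aligned}$$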

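The expressions for $s_{1:z}, s_{2:z}, s_{3:z}, \ldots$ may now be written:

$$\begin{aligned}
s_{1:z} &= P_1 s_{1:x}, \\
s_{2:z} &= P_1^2 s_{1:x}^2 + P_2 s_{2:x}, \\
s_{3:z} &= P_1^3 s_{1:x}^3 + 3 P_1 P_2\, s_{1:x} s_{2:x} + P_3 s_{3:x}, \\
s_{4:z} &= P_1^4 s_{1:x}^4 + 6 P_1^2 P_2\, s_{1:x}^2 s_{2:x} + 4 P_1 P_3\, s_{1:x} s_{3:x} + 3 P_2^2 s_{2:x}^2 + P_4 s_{4:x}, \\
&\text{etc.}
\end{aligned}$$

where, of course, $P_1^a P_2^b P_3^c \cdots$ is to be found by multiplying $P_1^a(\rho)\, P_2^b(\rho)\, P_3^c(\rho) \cdots$ and changing the exponents in the result into subscripts, e.g. to find $P_1 P_2$ first find $P_1(\rho) P_2(\rho) = \rho (\rho - \rho^2) = \rho^2 - \rho^3$ and then change the exponents into subscripts, obtaining $\rho_2 - \rho_3$.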
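The symbolic rule "multiply in $\rho$, then turn exponents into subscripts" can be carried out mechanically and checked against a direct enumeration for $s_{4:z}$. The sketch below is illustrative only; the names `P`, `poly_mul` and `symbolic_value` and the chosen variates are assumptions, and each polynomial is stored as a coefficient list indexed by the power of $\rho$.

```python
from itertools import combinations
from math import comb

# Coefficient lists in rho (index = exponent), taken from P_1(rho) ... P_4(rho)
P = {1: [0, 1], 2: [0, 1, -1], 3: [0, 1, -3, 2], 4: [0, 1, -7, 12, -6]}

def poly_mul(a, b):
    """Ordinary product of two polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def symbolic_value(factors, n, r):
    """Multiply the P_t(rho) named in factors, then read rho^k as C(n-k, r-k)."""
    poly = [1]
    for t in factors:
        poly = poly_mul(poly, P[t])
    return sum(c * (comb(n - k, r - k) if r >= k else 0)
               for k, c in enumerate(poly) if k > 0)

xs = [1, 2, 4, 7, 11]              # arbitrary variates (assumption)
n, r = len(xs), 3
s1, s2, s3, s4 = (sum(x ** t for x in xs) for t in (1, 2, 3, 4))

s4z = (symbolic_value([1, 1, 1, 1], n, r) * s1 ** 4
       + 6 * symbolic_value([1, 1, 2], n, r) * s1 ** 2 * s2
       + 4 * symbolic_value([1, 3], n, r) * s1 * s3
       + 3 * symbolic_value([2, 2], n, r) * s2 ** 2
       + symbolic_value([4], n, r) * s4)

direct = sum(sum(c) ** 4 for c in combinations(xs, r))
```

Since $r = 3$ here, $\rho_4$ is zero (no sample can contain four specified variates), which the guard `r >= k` handles.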

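One further step is necessary in order to emphasize the law for the formation of these expressions for $s_{1:z}, s_{2:z}, \ldots$. They may be written in the form

$$s_{2:z} = 2! \left\{ \frac{P_1^2 s_{1:x}^2}{2!\,(1!)^2} + \frac{P_2 s_{2:x}}{1!\,2!} \right\},$$

$$s_{3:z} = 3! \left\{ \frac{P_1^3 s_{1:x}^3}{3!\,(1!)^3} + \frac{P_1 P_2\, s_{1:x} s_{2:x}}{1!\,1!\,(1!)(2!)} + \frac{P_3 s_{3:x}}{1!\,3!} \right\},$$

$$s_{t:z} = t! \sum \frac{P_i^I P_j^J P_k^K \cdots\; s_{i:x}^I\, s_{j:x}^J\, s_{k:x}^K \cdots}{(i!)^I (j!)^J (k!)^K \cdots\; I!\, J!\, K! \cdots} .$$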

After computing by the direct method the first eight moments, under the assumption that $s_{1:x} = 0$, an article* which appeared in the Annals of Mathematical Statistics gives the following law for the formation of the functions $P_t(\rho)$
for • ,8: If ^ the coefficient of 

^ ^ in the expression for Pf (p) , then 




This is equivalent to saying that 


m»o 






That this law holds for all values of $t = 1, 2, \ldots$ is now easily established. For if it be assumed that this law holds for the expression for $s_{t-1:z}$ in terms of the power sums with respect to $x$, then it holds also for $s_{t:z}$ because the operators $D_1, D_2, \ldots, D_t$, and the equivalent operators in terms of $d_1, d_2, \ldots, d_t$, will express $s_{t:z}$ in terms of the power sums with respect to $z$ and of weight less than $t$. And the coefficients of the terms in the expression for $s_{t:z}$ depend only on the coefficients of these power sums of weight less than $t$. E.g. suppose the law holds for $t = 2$. Let

$$s_{3:z} = a_1 s_{1:x}^3 + a_2 s_{1:x} s_{2:x} + Q_3 s_{3:x} .$$

Operate on the left with $D_1$ and on the right with $d_1$. Hence

$$a_1 = P_1^3, \qquad a_2 = 3 P_1 P_2 .$$

Operate on the left with $6 D_3$ and on the right with $(d_1^3 + 6 d_1 d_2 + 6 d_3)$. Then

$$6 P_1 = 6 (P_1^3 + 3 P_1 P_2 + Q_3), \quad \text{therefore} \quad Q_3 = P_1 - P_1^3 - 3 P_1 P_2 .$$

But

$$P_1(\rho) - P_1^3(\rho) - 3 P_1(\rho) P_2(\rho) = \rho - \rho^3 - 3 \rho (\rho - \rho^2) = \rho - 3 \rho^2 + 2 \rho^3 = P_3(\rho) .$$

Hence

$$Q_3 = P_3 .$$

*Editorial, Annals of Mathematical Statistics, 1 (1930), page 107.

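16. Consider the functions $P_t(\rho)$, $t = 1, 2, \cdots, 10$.

$$\begin{aligned}
P_1(\rho) &= \rho, \\
P_2(\rho) &= \rho - \rho^2, \\
P_3(\rho) &= \rho - 3\rho^2 + 2\rho^3, \\
P_4(\rho) &= \rho - 7\rho^2 + 12\rho^3 - 6\rho^4, \\
P_5(\rho) &= \rho - 15\rho^2 + 50\rho^3 - 60\rho^4 + 24\rho^5, \\
P_6(\rho) &= \rho - 31\rho^2 + 180\rho^3 - 390\rho^4 + 360\rho^5 - 120\rho^6, \\
P_7(\rho) &= \rho - 63\rho^2 + 602\rho^3 - 2100\rho^4 + 3360\rho^5 - 2520\rho^6 + 720\rho^7, \\
P_8(\rho) &= \rho - 127\rho^2 + 1932\rho^3 - 10206\rho^4 + 25200\rho^5 - 31920\rho^6 + 20160\rho^7 - 5040\rho^8, \\
P_9(\rho) &= \rho - 255\rho^2 + 6050\rho^3 - 46620\rho^4 + 166824\rho^5 - 317520\rho^6 + 332640\rho^7 - 181440\rho^8 + 40320\rho^9, \\
P_{10}(\rho) &= \rho - 511\rho^2 + 18660\rho^3 - 204630\rho^4 + 1020600\rho^5 - 2739240\rho^6 + 4233600\rho^7 \\
&\quad - 3780000\rho^8 + 1814400\rho^9 - 362880\rho^{10} .
\end{aligned}$$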
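The coefficients listed above can be regenerated with a short recurrence. This is a sketch under an explicit assumption: the recurrence $c_{t,m} = m\,c_{t-1,m} - (m-1)\,c_{t-1,m-1}$ used below is not quoted from the text; it is simply one rule that reproduces the printed coefficients, and the function name `sampling_polynomial` is hypothetical.

```python
def sampling_polynomial(t):
    """Coefficients [c_1, ..., c_t] of P_t(rho) = c_1 rho + c_2 rho^2 + ... + c_t rho^t."""
    coeffs = [1]                              # P_1(rho) = rho
    for _ in range(t - 1):
        prev = coeffs + [0]                   # room for the next higher power
        # new c_m = m * (old c_m) - (m - 1) * (old c_{m-1}), with m counted from 1
        coeffs = [m * prev[m - 1] - (m - 1) * (prev[m - 2] if m >= 2 else 0)
                  for m in range(1, len(prev) + 1)]
    return coeffs

p4 = sampling_polynomial(4)
p6 = sampling_polynomial(6)
p10 = sampling_polynomial(10)
```

A convenient side check is that, for every $t \ge 2$, the coefficients of $P_t(\rho)$ sum to zero, i.e. $P_t(1) = 0$.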




Those who are familiar with the calculus of finite differences 
will recognize the coefficients in the above expressions, neglecting 
their signs, as the numbers appearing in the table of values of 

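If $u(x)$ and $v(x)$ are functions of $x$ then

$$\Delta^m \left[ u(x) \cdot v(x) \right] = v(x)\, \Delta^m u(x) + m \cdot \Delta v(x) \cdot \Delta^{m-1} u(x+1) + \cdots .$$

Now $x^n = x \cdot x^{n-1}$. Hence, letting $v(x) = x$ and $u(x) = x^{n-1}$,

$$\Delta^m x^n = x\, \Delta^m x^{n-1} + m\, \Delta^{m-1} (x+1)^{n-1}$$

and all the other terms vanish. Also $(x+1)^{n-1} = E\, x^{n-1} = (1 + \Delta)\, x^{n-1}$. Therefore

$$\Delta^m x^n = x\, \Delta^m x^{n-1} + m\, \Delta^{m-1} (1 + \Delta)\, x^{n-1} = x\, \Delta^m x^{n-1} + m \left( \Delta^{m-1} x^{n-1} + \Delta^m x^{n-1} \right) .$$

It is now possible to write

$$P_t(\rho) = \sum_{m=0}^{t-1} (-1)^m \left( \Delta^m 1^{t-1} \right) \rho^{m+1} .$$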
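The finite-difference form of $P_t(\rho)$ can be evaluated directly. In the sketch below, $\Delta^m 1^{t-1}$ is read as the $m$th forward difference of $x^{t-1}$ taken at $x = 1$; that reading, and the function names, are assumptions made for the illustration.

```python
from math import comb

def forward_difference_at(f, m, x):
    """m-th forward difference of f at x, with unit step."""
    return sum((-1) ** (m - j) * comb(m, j) * f(x + j) for j in range(m + 1))

def polynomial_from_differences(t):
    """Coefficients of rho, rho^2, ..., rho^t formed as (-1)^m * Delta^m 1^(t-1)."""
    return [(-1) ** m * forward_difference_at(lambda x: x ** (t - 1), m, 1)
            for m in range(t)]

q3 = polynomial_from_differences(3)
q4 = polynomial_from_differences(4)
q5 = polynomial_from_differences(5)
```

The lists `q3`, `q4` and `q5` hold the coefficients of $\rho, \rho^2, \ldots$ in that order, so they can be compared term by term with $P_3(\rho)$, $P_4(\rho)$ and $P_5(\rho)$.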

To show that this law is equivalent to the law given above, viz: 

assume they are equivalent for r> (^) and show that they 





are then equivalent for (p) That is, assume 

t-/ 

z 

m^O 








msO 




'm-t’ / 


1'hen 


' tlu tvv(^ Jav\s di(.*(|iiu iK n n 


o 

t 


(-0 


m 


t I ni-l t-i 

imt!) A / -h m A / 




m*i 


/77«0 ^ 


But this is true since if c - (• ! )^ A^ i ^ 

then 


=^-/; 


(^„^/)A'^ / Zi ’”-V 


‘(-/)'”A'^/*-\ 


Similarly, since m-t^m~i f 


/ / 


, then 


^ A i 


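17. Since $\Delta^m 1^{t-1} = \Delta^m (1 + \Delta)\, 0^{t-1}$, it is possible to write

$$P_t(\rho) = \sum_{m=0}^{t-1} (-1)^m \rho^{m+1}\, \Delta^m 1^{t-1} = \sum_{m=0}^{t-1} (-1)^m \rho^{m+1}\, \Delta^m (1 + \Delta)\, 0^{t-1} .$$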


The latter expression on the right suggests that $P_t(\rho)$
may be expressed as a function of $\rho$ and $x$, with $x$ set equal
to zero for each particular value of $t$. Suppose that $F_t(x, \rho)$
is such a function. Obviously $F$ can be neither a polynomial
in $x$ nor a rational function of any kind in $x$; for setting
$x$ equal to zero would show that $F$ would have the same value
for all values of $t$. The nature of the expression suggests
that $x$ enters $F$ only as a variable with respect to which
differentiation is to be carried out, $x$ then being set equal to zero.
There are two main reasons for this assumption. First of all,
since $x$ enters the difference expression only as a variable with
respect to which differencing is performed, $x$ being set equal to zero
after each differencing, the guess is that $x$ enters $F$ only as a
variable with respect to which differentiation is to be carried
out, $x$ being set equal to zero after each differentiation.
Besides this there is the intimate relation between $\Delta$ and $d/dx$.
For instance, $1 + \Delta = e^{d/dx}$, $d/dx = \log(1 + \Delta)$

and hence $\Delta^m$ can be replaced by a function of the $m$'th degree
in $d/dx$ and vice versa. Further, since the difference
expression contains $\Delta^m$ it is reasonable to try to express $P_t(\rho)$ as a
function involving $d^t/dx^t$. Now let $F_t(x, \rho) = \frac{d^t}{dx^t} f(x, \rho)$.
Since $t$ differentiations, none
of which are to give results identically zero, are to be carried
out then $f$ cannot be a rational function of $x$. Also functions
which involve the possibility of the derivative being infinite are
excluded. Hence try a transcendental function of $x$ and $\rho$.
The exponential function will not satisfy the conditions. Try 





And again f cannot be a rational 
function of ^ . Suppose f is an exponential function of oc , 
say Then 

Pt(p)’ 




The simplest case would be e ^ . But this 

does not satisfy ^i(/o)^/=> ■ Nor does 
nor ~ pt 

does satisfy the conditions since it has been shown^ that 


dx* 


■ log (pe / -/o ) 




satisfies the law , 

where is the coefiicicnt of in 

Hence $P_t(\rho)$ can be written in the three equivalent forms
for all values of $t$:


Pt(p)^L (-!) ^ 

m-o 




Pf(p)-l 


iff’O 


(ir)^l)c 


~ rr?i t-t 




777-/./ 






^Editorial, Annals of Mathematical Statistics, 1 (1930), pages 107, 108. 
Also see remark on “Sampling Polynomials,” page 120. 
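The equivalence of the difference form and the logarithmic form can be spot-checked with exact arithmetic. This is a hedged sketch, not the author's computation. It assumes that $P_t(\rho)$ is the $t$-th derivative at $x = 0$ of $\log(\rho e^x + 1 - \rho)$, that is, the $t$-th semi-invariant of a variable taking the value 1 with probability $\rho$ and 0 otherwise, and that the difference form reads $\sum_m (-1)^m (\Delta^m 1^{t-1})\rho^{m+1}$. The function names are illustrative.

```python
from fractions import Fraction
from math import comb


def diff_of_one(m, n):
    """Delta^m 1^n: the m-th forward difference of x**n evaluated at x = 1."""
    return sum((-1) ** (m - k) * comb(m, k) * (1 + k) ** n for k in range(m + 1))


def p_difference_form(t, rho):
    """Assumed difference form: sum over m of (-1)^m (Delta^m 1^(t-1)) rho^(m+1)."""
    return sum((-1) ** m * diff_of_one(m, t - 1) * rho ** (m + 1) for m in range(t))


def two_valued_semi_invariants(t_max, rho):
    """Semi-invariants of a 0/1 variable with P(1) = rho, from its raw moments.

    Every raw moment of order >= 1 equals rho; the semi-invariants follow from
    the standard moment recursion k_n = mu_n - sum C(n-1, j-1) k_j mu_(n-j).
    """
    mu = [Fraction(1)] + [Fraction(rho)] * t_max
    kappa = [Fraction(0)] * (t_max + 1)
    for n in range(1, t_max + 1):
        kappa[n] = mu[n] - sum(
            comb(n - 1, j - 1) * kappa[j] * mu[n - j] for j in range(1, n)
        )
    return kappa
```

For $t = 4$ the coefficients `diff_of_one(m, 3)` are 1, 7, 12, 6, so the polynomial is $\rho - 7\rho^2 + 12\rho^3 - 6\rho^4$.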




FUNDAMENTAL FORMULAS FOR THE DOOLITTLE
METHOD, USING ZERO-ORDER
CORRELATION COEFFICIENTS


By 

Harold D. Griffin 

Dean of Crescent College, Eureka Springs, Arkansas 


So far as the writer has been able to determine, fundamental
formulas for the Doolittle method as applied to the solution of
normal linear equations expressed in correlation coefficients have
never before been developed. Because of their peculiar telescoping
qualities, the writer has termed them “endothetic formulas.”
Perhaps the best way to judge the respective merits of three
methods of solving simultaneous linear equations to obtain the
coefficients of partial regression (the β's), namely determinants, Kelley’s
partial regression method,¹ and Doolittle’s direct substitution
method,² is to compare the formulas by which each might be expressed.


¹Kelley, T. L. Chart to Facilitate the Calculation of Partial Coefficients
of Correlation and Regression Equations. 1st ed. School of Education,
Special Monograph No. 1. Palo Alto: Stanford University Publications,
1921.

²Wallace, H. A., and Snedecor, G. W. Correlation and Machine Calculation.
1st ed. Official Publ. Vol. 23, No. 35. Ames: Iowa State College
of Agriculture, 1925.





THREE-VARIABLE FORMULAS


Determinants 


A 


r r r 


02 / ^2 
n 


r - r r 


/? o/ c 
^012" 


IZ 


Kelley’s 




r - r r 

*02 'oi 12 




" r r 

^ o / *^02 /2 


ois f.^z 
JZ 


%st~ hr^l 


Doolittle’s 


$\beta_{01.2} = r_{01} - r_{12}\,\beta_{02.1}$


OPERATIONS REQUIRED IN SOLVING A
THREE-VARIABLE PROBLEM

                      Determinants   Kelley’s   Doolittle’s
Consulting tables          1            1            1
Adding                     0            0            0
Subtracting                2            2            2
Multiplying                2            2            2
Dividing                   2            2            1


In a three-variable problem the Doolittle method has but a 
very slight advantage over the Determinant method and the 
method used by Kelley in his Chart. 
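The saving of one division can be seen in a short sketch. The displayed formulas are badly damaged in this copy, so the code below assumes the standard closed forms for the two partial regression coefficients and the back-substitution step $\beta_{01.2} = r_{01} - r_{12}\beta_{02.1}$. The function names are illustrative only.

```python
from fractions import Fraction


def betas_by_determinants(r01, r02, r12):
    """Both coefficients from the closed forms: two products, two divisions."""
    denom = 1 - r12 ** 2  # in the article's count this comes from a table
    b01_2 = (r01 - r02 * r12) / denom
    b02_1 = (r02 - r01 * r12) / denom
    return b01_2, b02_1


def betas_by_doolittle(r01, r02, r12):
    """One coefficient by division, the other by substitution: one division."""
    b02_1 = (r02 - r01 * r12) / (1 - r12 ** 2)
    b01_2 = r01 - r12 * b02_1  # back substitution, no second division
    return b01_2, b02_1
```

With `r01 = Fraction(1, 2)`, `r02 = Fraction(2, 5)` and `r12 = Fraction(3, 10)` both routes give 38/91 and 25/91, and these values satisfy the two normal equations exactly.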





FOUR-VARIABLE FORMULAS 


ido 


Determinants 

-'^23 ^ ^3 ''^3 




- ^>'.3 (' i . ''*3^'■03 '■J 


02 13 


/5. 


-'■33 ""/a '■33 

_ ^/ '^~^02 ^>2 ~ ^ 0-3 '~ t 3 * '~23 ^^02 ^13 * ^03 ^ 12 ^ 


-^L ^^^2^3 >^23 

Kelley’s




A 


02 ./a 


A 


0/ 


f' ’’ r r ^ f' t* 

*03 Of * t3 *02 Of ^13 

y . 

'23 ~ ^2 

/-. /*^ /— 

A 

’ * t3 

/ _ ^3 “ ^2 

p - r r 

^23 ^2 ' t3 



/3 

^2 " ^2 ^3 ■ '"of '"fs 

X 

r ~ r p 

23 /2 /3 

f-ni )-r= 


1-^.t 

^ ^3 ^ ^ ^3 “ 

■ Oa '"/j 

' i 2. 

/2 

r - r' r r - p r 

*af 02 tz 03 02 23 

y 

p p p 

f3 /2 23 

/-r^ J-r^ 

' /2 ' 23 

A 

t-r^ 

' ' 12 


/- 


r - r r 

/2 23 

I-r^ 

£S 


r -1 r 

^/3 /2 23 





Doolittle’s 


r - r r 

r* ” r "• <3/ y //^ - A' r ^ 

^03 *Ol 13 t ^(*33 ^t2 V3 

A --i_k__ 

r^o3./3 r -r r 

hr^ - -JJ. _ f^ J j^ xcr -r r ) 


A 


02 13 


r -r r 

02 'ot 


r ' r r 

23 /^ /3 






/e 


$\beta_{01.23} = r_{01} - r_{12}\,\beta_{02.13} - r_{13}\,\beta_{03.12}$

OPERATIONS REQUIRED IN SOLVING A
FOUR-VARIABLE PROBLEM

                      Determinants   Kelley’s   Doolittle’s
Consulting tables          6            3            2
Adding                    12            0            1
Subtracting                4           12            4
Multiplying               18           12            8
Dividing                   3           11            3


In a four-variable problem the Doolittle method is seen to
have a decided advantage over the other two. An examination
and comparison of these fundamental formulas for three and
four variables would seem to justify the conclusion that an
increasing number of variables would but enhance the manifest
superiority of the Doolittle method.
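The comparison can be made concrete for any number of variables. The sketch below is an assumption-laden illustration rather than Griffin's worksheet. It treats the Doolittle scheme as forward elimination followed by back substitution on the normal equations written in correlation coefficients, and compares the result with a determinant (Cramer's rule) solution. All names are hypothetical.

```python
from fractions import Fraction


def doolittle_solve(R, r0):
    """Forward elimination, then back substitution, on R * beta = r0."""
    n = len(r0)
    A = [row[:] + [rhs] for row, rhs in zip(R, r0)]  # augmented working copy
    for i in range(n):
        for j in range(i + 1, n):
            factor = A[j][i] / A[i][i]
            for c in range(i, n + 1):
                A[j][c] -= factor * A[i][c]
    beta = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(A[i][c] * beta[c] for c in range(i + 1, n))
        beta[i] = (A[i][n] - known) / A[i][i]
    return beta


def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum(
        (-1) ** c * M[0][c] * det([row[:c] + row[c + 1:] for row in M[1:]])
        for c in range(len(M))
    )


def cramer_solve(R, r0):
    """Each beta as a ratio of determinants."""
    d = det(R)
    return [
        det([row[:k] + [r0[i]] + row[k + 1:] for i, row in enumerate(R)]) / d
        for k in range(len(r0))
    ]
```

In the four-variable case `R` holds $r_{12}$, $r_{13}$, $r_{23}$ around a unit diagonal and `r0` holds $r_{01}$, $r_{02}$, $r_{03}$; the first component returned then satisfies $\beta_{01.23} = r_{01} - r_{12}\beta_{02.13} - r_{13}\beta_{03.12}$. The cofactor expansion grows factorially with the number of variables, while the elimination grows only as its cube, which agrees with the article's conclusion.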




ON A PROPERTY OF THE SEMI-INVARIANTS
OF THIELE


By 


Cecil C. Craig 
National Research Fellow 


Given a general linear form

(1)    $a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$

of a set of statistical variables, $x_1, x_2, \ldots, x_n$,¹ it
is well-known that in case the variables, $x_1, x_2, \ldots, x_n$,
are independent, in the sense of the theory of probability,
the $r$'th semi-invariant of this form is simply

(2)    $\sum_i a_i^r\,\lambda_r^{(i)}$,

in which $\lambda_r^{(i)}$ is the $r$'th semi-invariant of $x_i$.² This is
perhaps the most important and useful property of semi-invariants.
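The property in (2) can be illustrated with exact arithmetic on small discrete distributions. This is only a sketch under stated assumptions: the semi-invariants are obtained from the raw moments by the usual recursion, the two example distributions are arbitrary, and the function names are not from the article.

```python
from fractions import Fraction
from itertools import product
from math import comb


def semi_invariants(dist, r_max):
    """Semi-invariants 1..r_max of a discrete variable given as {value: probability}."""
    mu = [sum(p * Fraction(v) ** n for v, p in dist.items()) for n in range(r_max + 1)]
    lam = [Fraction(0)] * (r_max + 1)  # index 0 is unused
    for n in range(1, r_max + 1):
        lam[n] = mu[n] - sum(comb(n - 1, j - 1) * lam[j] * mu[n - j] for j in range(1, n))
    return lam


def linear_form(dists, coeffs):
    """Distribution of sum(a_i * x_i) when the x_i are independent."""
    out = {}
    for combo in product(*(d.items() for d in dists)):
        value = sum(a * v for a, (v, _) in zip(coeffs, combo))
        prob = Fraction(1)
        for _, p in combo:
            prob *= p
        out[value] = out.get(value, 0) + prob
    return out
```

For independent `x1` and `x2` and coefficients `(2, -3)`, the $r$-th semi-invariant of the form equals $2^r \lambda_r^{(1)} + (-3)^r \lambda_r^{(2)}$ for every order tried.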

Each semi-invariant is defined as a certain isobaric function 
of the moments of weight equal to the order of the semi-invariant. 
The question to which this note is devoted is whether among such 
isobaric functions, the property given above belongs uniquely to 
the semi-invariant. This problem is equivalent to another which 


¹There is no loss in generality in supposing the origin so chosen for each
$x_i$ that the constant in the form is zero.

²Thiele, T. N., Theory of Observations (C. & E. Layton, London, 1903),
p. 39.






seems more difficult to state verbally. The $r$'th semi-invariant
of the form (1) is itself found in terms of the semi-invariants,
$\lambda_{rst\cdots}$, of the $n$-way probability function
$F(x_1, x_2, \ldots, x_n)$¹ by means of a symbolic multinomial
expansion. Now in order that the above property may hold
generally it is necessary and sufficient that the cross-semi-invariants
of $x_1, x_2, \ldots, x_n$ should vanish if $x_1, x_2, \ldots, x_n$ are
independent; that is, that each $\lambda_{rst\cdots}$ in
which at least two of the quantities $r, s, t, \ldots$ are different
from zero, should vanish identically. Now are semi-invariants
the only such functions of moments, whose “cross” members behave
in this way?

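The semi-invariants of the given linear form are defined by

(3)    $e^{L_1 t + L_2 t^2/2! + L_3 t^3/3! + \cdots} = \int dF(x_1, x_2, \ldots, x_n)\; e^{t(\sum a_i x_i)}$,

which is to be regarded as a formal identity in $t$. And the
semi-invariants of $x_1, x_2, \ldots, x_n$ are given by

(4)    $e^{(\sum \lambda_i t_i) + (\sum \lambda_i t_i)^2/2! + (\sum \lambda_i t_i)^3/3! + \cdots} = \int dF(x_1, x_2, \ldots, x_n)\; e^{\sum x_i t_i}$,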


¹We shall observe the distinction between probability functions and
frequency functions suggested by H. Cramér in his important memoir: “On
the Composition of Elementary Errors,” Skandinavisk Aktuarietidskrift,
1928, p. 13. By a probability function we mean what has been called the
cumulative frequency function and thus in the above we are using an
$n$-way Stieltjes integral.









which is also a formal identity in $t_1, t_2, \ldots, t_n$.

The quantities $(\sum \lambda_i t_i)^r$ and $(\sum a_i \lambda_i)^r$ refer
to symbolic multinomial expansions, perhaps most easily explained
by means of examples. Thus




and 








(9) 


^ 90 ^; 


■^ 09 *: 




t. rj 


inwhich ^ .., 

in our first used notation, and A,,^ ^ , A^,j,. ^ , etc. 

are cross-semi-invariants of x, and . 

Then by inspection of (3) and (4) it is evident that

(5)    $L_r = (\sum a_i \lambda_i)^r$.


In case the variables $x_1, x_2, \ldots, x_n$ are independent
of each other $F(x_1, x_2, \ldots, x_n)$ splits up into the
product $F_1(x_1)\,F_2(x_2) \cdots F_n(x_n)$ of the probability functions
of the separate variables, $L_r$ becomes equal to the expression
(2), and all the cross-semi-invariants in the expansion of the
right member of (4) become identically zero. That the vanishing
of these cross-semi-invariants is not only a sufficient but is
also a necessary condition that $L_r$ assume the value (2) is
evident from the absence of any restrictions on $F(x_1, x_2, \ldots, x_n)$
(except that it be an $n$-way probability function) or on the set
$a_1, a_2, \ldots, a_n$.

Now each cross-semi-invariant is expressed as a certain isobaric
function of moments, some of them cross-moments. But







in the case of independent variables,

$\nu_{rst\cdots} = \nu_{r00\cdots}\,\nu_{0s0\cdots}\,\nu_{00t\cdots} \cdots$,

and when this is true, the value of each cross-semi-invariant
becomes identically zero. To illustrate this and for use in the
demonstration that the semi-invariants are the only such functions,
let us write out the fourth order semi-invariants of
$F(x_1, x_2, \ldots, x_n)$ in terms of moments. These are obtained by
equating coefficients of like terms in

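(6)    $(\sum \lambda_i t_i)^4 = (\sum \nu_i t_i)^4 - 4(\sum \nu_i t_i)^3 (\sum \nu_i t_i) - 3[(\sum \nu_i t_i)^2]^2 + 12(\sum \nu_i t_i)^2 [(\sum \nu_i t_i)]^2 - 6[(\sum \nu_i t_i)]^4$.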

Leaving off superfluous zeros in the subscripts, this gives 
for example 

* ^02 <5 -l/, 6 a// VJ. 


If in the value of we set > 

etc., then H 0 as it was already known must happen. 
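The vanishing of the cross-semi-invariants under independence can be verified on a small example. The sketch below does not use the symbolic expansion (6) itself. As an assumption it uses the equivalent, well-known expression of a joint semi-invariant as a sum over set partitions of products of moments, each term weighted by $(-1)^{b-1}(b-1)!$ for a partition into $b$ blocks. The names and the example distributions are illustrative.

```python
from fractions import Fraction
from math import factorial


def set_partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        yield [[first]] + part
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]


def moment(joint, idx):
    """E[product of x[i] for i in idx] for a joint distribution {point: probability}."""
    total = Fraction(0)
    for point, p in joint.items():
        term = Fraction(p)
        for i in idx:
            term *= point[i]
        total += term
    return total


def semi_invariant(joint, idx):
    """Joint semi-invariant of the variables listed (with repetition) in idx.

    idx = (0, 0, 0, 1) gives lambda_31, idx = (0, 0, 1, 1) gives lambda_22.
    """
    total = Fraction(0)
    for part in set_partitions(list(range(len(idx)))):
        b = len(part)
        term = Fraction((-1) ** (b - 1) * factorial(b - 1))
        for block in part:
            term *= moment(joint, [idx[j] for j in block])
        total += term
    return total
```

For a product distribution $\lambda_{31}$ and $\lambda_{22}$ come out exactly zero, while a dependent pair such as `{(0, 0): 1/2, (1, 1): 1/2}` gives a non-zero $\lambda_{11}$.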

For the sake of simplicity let us suppose, at first, that the 
component variables in (1) are all “equal,” that is, that F ( , 

'3c^,., F ( vzr, JT,.). In,the case of 

¹The general formula giving semi-invariants in terms of moments is to be
found in several places. See e.g., C. Jordan, Statistique Mathématique
(Gauthier-Villars, Paris, 1927), p. 41. For an elementary derivation and
also for an extended example of the use of semi-invariants of a correlation
function of several variables see the author’s “An Application of Thiele’s
Semi-invariants to the Sampling Problem,” Metron, Vol. VII, No. 4
(1928), pp. 3-74.








independence among $x_1, x_2, \ldots, x_n$, we can write also
$F_1(x_1) = F_2(x_2) = \cdots = F_n(x_n) = F(x)$. An
equivalent assumption is that all moments and hence all
semi-invariants of the same type of $F(x_1, x_2, \ldots, x_n)$ are
equal. (Moments of the same type are all those with the same
combination of digits in their subscripts.) Then the expressions
for all the semi-invariants of the fourth order of
$F(x_1, x_2, \ldots, x_n)$ are equivalent to the following:

«V''4VV 1 /^ 

A “t/ “A/ "v/ +3i/ i/J "3V V i/ 

A •V ~{2.'\l "J '^ZyJ 1 / -^ZV V^) “6V^ 

' an 2M 2/ #0 '^2oV/ ^ *'// ' *\20 /O ^ iC 

X =V -4V V i/^ -6 t/^ 

411# •#/## ^“#1*^10 ^ ^ // #0 to 

Now, our general isobaric function of the moments of weight 
four can be written 


(B) 


f(B, V, 4 (i V, ^ /"'(z V, t,) 

-32, [(f,;. i;./"J (H t, f‘(iv, t, f- e^.fl V, t,)‘ 


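And in our special case of equal component variables
$x_1, x_2, \ldots, x_n$
our problem is to determine for what sets of values of
$z_1, z_2, \ldots, z_5$
the coefficients of $t_1^3 t_2$, $t_1^2 t_2^2$, $t_1^2 t_2 t_3$ and
$t_1 t_2 t_3 t_4$
in the right member of (8) vanish identically if
$x_1, x_2, \ldots, x_n$
are independent.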

By comparison with (7) it is seen that this gives four linear 
equations with which to determine the five unknowns. But we 






can add a fifth equation by stating that the coefficient of is 
in general a parameter which in the case of independence is a 
function of P Co:) and ^ , which we shall desig¬ 

nate by . Then we have for the determination of : 

o 

o -6v;-^ 



-■^ 3 ^ 


/2V^V,^ 

-6V,* 





-6V/ 

< 




V 

'O 

1 





-6v; 

< 

-4V,^ 

-3y/,^ 

/2V* 

-6V/ 


By adding each of the four other columns to the first column
in the denominator, we have at once in view of (7),





A 


unless the identical first minor of numerator and denominator 
vanishes. But this can happen only if there is linear dependence 
between the corresponding elements in the four rows of this minor 
which in turn can happen only if there is a linear relation between 
the quantities ^ > 3^d (Such a linear de¬ 

pendence would exist if the second or third semi-invariant of 
Fix) is zero.) ^ 

Moreover, it is readily seen that we get 2,« 

(Of course we suppose ^ O and moreover C^= 0 could 
hold only fo^ some F (oc) ’s) 

If we no longer suppose the components ^ 

“equal” in the sense defined above, the quantities in (7) may be 
replaced by summations of all terms of the same type or summa¬ 
tions of all products of terms which are coefficients of similar 





terms in ’s. Thus in place of >^ 4 ^ . >4,. Xo m the first 

equation, and \ and -v/^ in the second we now write, 

2 ^ 4 ^ = ^6 ■*■^ 04 *^ 00 ^'* • ■ ‘ 

EV«0 =V4o-^ V„4 + Voo4+ - - 

■*■ >^3 '^o, -"^ 003 ^ 00 , ■ 

2 ! ~ ^ 3 / *^«3I * ^0/3 * ■ 

^ ^>0 * 0 / '' 0 / d 3 'vo '* 3 © '^> 0 | *^063 

respectively. But otherwise our argument will be the same and 
lead to the same conclusion. 

It is obvious that the argument for weight four is perfectly
general and thus that the same kind of conclusions hold for any
weight. We conclude that the semi-invariants are the only isobaric
functions of the moments of a set of $n$ variables which
have the properties described in the first two paragraphs independent
of the probability or frequency functions of those variables.

But if when the variables are independent the probability 
function of each one is such that there is an isobaric relation 
among the moments of order lower than k , the same for each 
variable, then there are other isobaric functions of order k and 
higher which enjoy the property of semi-invariants in question. 
And it will be shown that the only isobaric relations among the 
moments of order < k , mentioned above, which lead to the new 
isobaric functions of this type of order > k , are obtained by 
setting semi-invariants of order < k , equal to zero. 

Let us return to the case in which the weight is four. Then 
^ 3 * > 3 ’ ^ 0 , the minor of our denom¬ 

inator D vanishes, and so, of course, does the corresponding 
minor in the numerator. Then as a matter of fact there is a 
double infinity of the sought isobaric functions of weight four. 





Some of them are given by the following sets of values of the $z$'s.


2 , 





5 

2 

5 

2 

1 

6 

3 

6 

3 

2 

9 

3 

9 

3 

1 


as may be verified by actual computation.
Now we also have¹


fo ic a ^ f 

from which we can write in place of ( 8 ) 


( 10 ) 


yJ-y, A )(Z v, n r 


in which we can seek to find sets of values of .so 

that the coefficients of ^3 ^ 4 . and f, 

will vanish when the oc’s are independent. This will give us four 
homogeneous linear equations in which the determinant of the 
coefficients vanishes identically since - 1 is a 

solution. Addition of the second, third and fourth columns to 
the first gives a new first column of zeros. But if, say, A^= 0, 
in addition to A^,, and A,,, which already vanish if the jc's are 
independent, then the elements of the fourth column are all zeros 
also, and our determinant is of rank not greater than two. But 
since the solution of the set of equations arising from ( 10 ) is 
equivalent to that arising from ( 8 ), the minor , of £) in ( 9 ) 


¹Thiele, T. N., loc. cit., p. 25.








must vanish in case 0. 

But sinrp Z -2 -••= 2 -»lisa Solution of the equations 
(8), it is easy to see that if in , the sum of the last three 
columns be added to the first column, the resulting first column 
will be identical, though opposite in sign with the last four elements 
of the first coulmn of D . Let us indicate the new by D* . 

Now there is a linear dependence between the elements of 
the rows of . In fact the elements of the first row minus 
three times the corresponding elements of the third plus twice the 
corresponding elements of the fourth ( 

must give zero for each element. For suppose there exists an¬ 
other such linear relationship between rows. This linear relation¬ 
ship must hold between the corresponding elements of the first 
column of DJ , and we have a new isobaric relation between the 
moments of oc . But a probability function F {oc)can always be 
found in which 

( 11 ) 

holds and the other relation does not. But for the 's in 

which (11) holds D'^ must vanish, and thus the relation between 
columns must be that given by (11). 

Thus D,f contains as factors and A, . That it 

contains no others can easily be verified directly. 

The cases of weights two, three, and four are easily handled 
directly throughout. If the weight is now Jh greater than four, 
our argument readily generalizes. The equations now arising 
frc«n the relation corresponding to (10) are now greater in num¬ 
ber than the unknowns y,. ., , but it is obvious 

that the matrix of the codficients is of rank not greater than Jlr-2. 
And it follows just as before that • ■ • / , are 

all factors of the new . 

The argument above which shows for the weight four, that 






is a factor of does not show that there cannot be other 
linear relations between the elements of the first column which 
are also factors of . It only shows that if there is such a fac¬ 
tor, the corresponding linear dependence holds for certain rows 
of D,, . 

Let us consider the case of weight five. The elements of the 
first column of D are now 

^ and the elements of the first column of are the last six 
of these with opposite sign, and they thus correspond to the par¬ 
titions of 5. We know that one of the two sets of three rows of 
D,f , the second, fourth, and fifth or the third, fifth, and sixth, 
are connected by the linear relation corresponding to ~ '3 t4 

A,-0 so that A , is at least once a factor of D,, . If 
we suppose that the first set of three rows are so related, does it 
follow that this same relation holds for the second set? Now it 
is easy to see that if in the second row i/,^ be everywhere sub¬ 
stituted for the resulting row will be identical with the third 
and that the same is true of the fourth and fifth rows and of the 
fifth and sixth. Then if a certain linear relation holds for the 
first set of three rows, by the substitution of *i/^ for every¬ 
where in it, it follows that the same relation holds for the second 
set of three rows also. Thus is twice a factor in for 
weight five. We note also that the partitions of 3 (counting 3 
as a partition of 3) are twice found with common factors among 
the partitions of 5, that is, 32, 221, 2111; and 311, 2111, 11111. 

The argument is readily generalized^ and in case of of 
weight k , each semi-invariant of weight r < fc is a factor of 

^The general argument is based oil the principle that the second row of D 
is obtained from* the process which gives the first by replacing one factor 
by , the third from the first by replacing by // , the fourth 
f^m the first by replacing if by , and so on (see (6) and (7)). 

Thus in the case of weight six, to compare the three rows beginning with 
with the three beginning with 

we replace the in the first set which arises as a coefficient of 
by and the two sets of rows become identical. 





as often as the partitions of r are found with common factors 
among the partitions of k . (We count as a partition of r .) 
Thus for weight four, Ej,® A3 Aj which gives D», the cor¬ 
rect weight sixteen. In case of weight five, D„* A^ A' 

which again gives $D_{11}$ the correct weight thirty. And it is easy
to show by induction that in case of weight $k$ this method gives
$D_{11}$ its proper weight. Among the partitions of $k$ are found
all the partitions of $k - 1$ with a part 1 added to each. Thus each
of these adds $k$ to the total weight. For the partition $k - 2$, 2,
it is seen that the remaining partitions of $k - 2$ with the common
additional part 2 will be found among the remaining partitions of
$k$ and that the remaining partitions of 2 with the common additional
part $k - 2$ will also be found. Thus this partition contributes
the weight $k$ to the total. And similarly it can be seen
that every partition of $k$ contributes $k$ to the total weight of $D_{11}$,
which was to be proved.

Finally, then, we have the additional result that the necessary
and sufficient condition that more than one isobaric function of
weight $k$ of the moments of the probability variables $x_1, x_2, \ldots, x_n$
exists which has the semi-invariant properties in question, is that
the probability functions of $x_1, x_2, \ldots, x_n$ in case of
independence are such that for some $r < k$, $\lambda_r$ vanishes for each of
them.


Stanford University. 






THE THEORY OF OBSERVATIONS 


By 

T. N. Thiele


EDITOR’S NOTE 


Thiele’s “Theory of Observations” constitutes a classic
contribution to both mathematical statistical theory and the theory
of least squares. Unfortunately, his researches, and in particular
his semi-invariant or “half-invariant” theory, have not received
the recognition in this country that they deserve. Since, according
to importers of books, the “Theory of Observations” is now
out of print and copies are rare, the editor has deemed it advisable
as a matter of policy to make this work in this way available
to the readers of the Annals.

This reprint should also be construed as an acknowledgment
of our indebtedness to Mr. Arne Fisher for his unswerving endeavors
to bring before American statisticians the important contributions
of Danish and Scandinavian writers.




CONTENTS.

Numbers of
formulae.                                                               Page

I. The Law of Causality.
§ 1.          Belief in Causality ........................................ 165
§ 2.          The Observations and their Circumstances ................... 166
§ 3.          Errors of Observations ..................................... 167
§ 4.          Theoretical and Empirical Science .......................... 168

II. Laws of Errors.
§ 5.          On Repetitions
§ 6.          Laws of Actual Errors and Laws of Presumptive Errors ....... 169
§ 7.          The Law of Large Numbers of Repetitions .................... 170
§ 8.          Four Different Forms of Laws of Errors ..................... 171

III. Tabular Arrangements.
§ 9.          Frequency and Probability .................................. 172
§ 10.         Repetitions with Qualitative Differences between the Results  172
§ 11.         Repetitions with Quantitative Differences between the Results

IV. Curves of Errors.
§ 12.         Curves of Actual Errors of Observations in Discontinued Values 174
§ 13.         Curves of Actual Errors for Rounded Observations ........... 174
§ 14.         Curves of Presumptive Errors ............................... 175
§ 15.         Typical Curves of Errors ................................... 176
§ 16.         Particular Measures of Curves of Errors .................... 176

V. Functional Laws of Errors.
§ 17.  1.     Their Determination by Interpolation
§ 18.  2-8.   The Typical or Exponential Law of Errors ................... 183
              Problems ................................................... 183
§ 19.  9-13.  The Binomial Functions ..................................... 184
§ 20.  14.    Some of the more general Functional Laws of Errors. Series





















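Numbers of
formulae.                                                               Page

XVII. Mathematical Expectation and its Mean Error.
§ 74.           Mathematical Expectation ................................. 301
       138-140. Examples ................................................. 302
§ 75.  141-143. Mean Errors of Mathematical Expectation of Unbound Events  303
                Examples ................................................. 304
§ 76.  144-146. Mean Error of Total Mathematical Expectation of the Same Trial 305
       147.     Examples ................................................. 306
§ 77.           The Complete Expression of the Mean Errors ............... 306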











I. THE LAW OF CAUSALITY. 


§ 1. We start with the assumption that everything that exists, and everything
that happens, exists or happens as a necessary consequence of a previous state of things.
If a state of things is repeated in every detail, it must lead to exactly the same consequences.
Any difference between the results of causes that are in part the same, must be explainable
by some difference in the other part of the causes.

This assumption, which may be called the law of causality, cannot be proved, but
must be believed; in the same way as we believe the fundamental assumptions of religion,
with which it is closely and intimately connected. The law of causality forces itself upon
our belief. It may be denied in theory, but not in practice. Any person who denies it,
will, if he is watchful enough, catch himself constantly asking himself, if no one else, why
this has happened, and not that. But in that very question he bears witness to the law
of causality. If we are consistently to deny the law of causality, we must repudiate all
observation, and particularly all prediction based on past experience, as useless and misleading.

If we could imagine for an instant that the same complete combination of causes
could have a definite number of different consequences, however small that number might
be, and that among these the occurrence of the actual consequence was, in the old sense
of the word, accidental, no observation would ever be of any particular value. Scientific
observations cannot be reconciled with polytheism. So long as the idea prevailed that the
result of a journey depended on whether the power of Njord or that of Skade was the
stronger, or that victory or defeat in battle depended on whether Jove had, or had not,
listened to Juno's complaints, so long were even scientists obliged to consider it below their
dignity to consult observations.

But if the law of causality is acknowledged to be an assumption which always 
holds good, then every observation gives us a revelation which, when correctly appraised 
and compared with others, teaches us the laws by which God rules the world. 

We can judge of the far-reaching consequences it would have, if there were
conditions in which the law of causality was not valid at all, by considering the cases in
which the effects of the law are more or less veiled.





In inanimate nature the relation of cause and effect is so clear that the effects are 
determined by observable causes belonging to the condition immediately preceding, so that 
the problem, within this domain, may be solved by a tabular arrangement of the several 
observed results according to the causing circumstances, and the transformation of the 
tables into laws by means of interpolation. When, however, living beings are the object 
of our observations, the case immediately becomes more complicated. 

It is the prerogative of living beings to hide and covertly to transmit the
influences received, and we must therefore within this domain look for the influencing causes
throughout the whole of the past history. A difference in the construction of a single
cell may be the only indication present at the moment of the observation that the cell is
a transmitter of the still operative cause, which may date from thousands of years back.
In consequence of this the naturalist, the physiologist, the physician, can only quite
exceptionally attain the same simple, definite, and complete accordance between the observed
causes and their effects, as can be attained by the physicist and the astronomer within
their domains.

Within the living world, communities, particularly human ones, form a domain
where the conditions of the observations are even more complex and difficult. Living
beings hide, but the community deceives. For though it is not in the power of the
community either to change one tittle of any really divine law, or to break the bond between
cause and effect, yet every community lays down its own laws also. Every community
tries to give its law fixity, and to make it operate as a cause; for instance, by passing it
off as divine or by threats of punishment; but nevertheless the laws of the community
are constantly broken and changed.

Statistical Science which, in the case of communities, represents observations, has
therefore a very difficult task; although the observations are so numerous, we are able from
them alone to answer only a very few questions in cases where the intellectual weapons of
historical and speculative criticism cannot assist in the work, by independently bringing to
light the truths which the communities want to conceal, and on the other hand by
removing the wrong opinions which these believe in and propagate.

§ 2. An isolated sensation teaches us nothing, for it does not amount to an
observation. Observation is a putting together of several results of sensation which are or
are supposed to be connected with each other according to the law of causality, so that
some represent causes and others their effects.

By virtue of the law of causality we must believe that, in all observations, we get
essentially correct and true revelations; the difficulty is, to ask searchingly enough and to
understand the answer correctly. In order that an observation may be free from every
other assumption or hypothesis than the law of causality, it must include a perfect




description of all the circumstances in the world, at least at the instant preceding that at
which the phenomenon is observed. But it is clear that this far surpasses what can be done,
even in the most important cases. Real observations have a much simpler form. By giving
a short statement of the time and place of observation, we refer to what is known of the
state of things at the instant; and, of the infinite multiplicity of circumstances connected
with the observation we, generally, not only disregard everything which may be supposed to
have little or no influence, but we pay attention only to a small selection of circumstances,
which we call essential, because we expect, in virtue of a special hypothesis concerning
the relation of cause and effect, that the observed phenomenon will be an effect of these
circumstances only.

Nay, we are often compelled to disregard certain circumstances as unessential,
though there is no doubt as to their influencing the phenomenon; and we do this either
because we cannot get a sufficient amount of trustworthy information regarding them, or
because it would be impracticable to trace out their connection with the effect. For
instance in statistical observations on mortality, where the age at the time of death can
be regarded as the observed phenomenon, we generally mention the sex as an essential
circumstance, and often give a general statement as to residence in town or country, or as
to occupation. But there are other things as to which we do not get sufficient information;
whether the dead person has lived in straitened or in comfortable circumstances, whether
he has been more or less exposed to infectious disease, etc.; and we must put up with this,
even if it is certain that one or other of these things was the principal cause of death.
And analogous cases are frequently met with both in scientific observations and in everyday
occurrences.

In order to obtain a perfect observation it is necessary, moreover, that our sensations
should give us accurate information regarding both the phenomenon and the attendant
circumstances; but all our senses may be said to give us merely approximate descriptions
of any phenomenon rather than to measure it accurately. Even the finest of our senses
recognizes no difference which falls short of a certain finite magnitude. This lack of
accuracy is, moreover, often greatly increased by the use of arbitrary round numbers
for the sake of convenience. The man who has to measure a race-course, may take into
account the odd metres, but certainly not the millimetres, not to mention the microns.

§ 3. Owing to all this, every actual observation is affected with errors. Even our
best observations are based upon hypothesis, and often even on an hypothesis that is
certainly wrong, namely, that only the circumstances which are regarded as essential, influence
the phenomenon; and a regard for practicability, expense, and convenience makes us give
approximate estimates instead of the sharpest possible determinations.

Now and then the observations are affected also by gross errors which, although



not introduced into them on purpose, are yet caused by such carelessness or neglect that 
they could have been, and ought to have been, avoided. contradistinction to these we 
often call the more or less unavoidable errors accidental. For accident (or chance) is not, 
what the word originally meant, and what still often lingers in our ordinary acceptation 
of it, a capricious power which suffers events to happen without any cause, but only a
name for the unknown element, involved in some relation of cause and effect, which prevents
us from fully comprehending the connection between them. When we say that it
is accidental, whether a die turns up “six” or “three”, we only mean that the circumstances
connected with the throwing, the fall, and the rolling of the die are so manifold that no 
man, not even the cleverest juggler and arithmetician united in the same person, can suc¬ 
ceed in controlling or calculating them. 

In many observations we reject as unessential many circumstances about which we 
really know more or less. We may be justified in this; but if such a circumstance is of 
sufficient importance as a cause, and we arrange the observations with special regard to 
it, we may sometimes observe that the errors of the observations show a regularity which 
is not found in “accidental” errors. The same may be the case if, in computations dealing
with the results of observations, we make a wrong supposition as to the operation of some
circumstance. Such errors are generally called systematic.

§ 4. It will be found that every applied science, which is well developed, may be
divided into two parts, a theoretical (speculative or mathematical) part and an empirical 
(observational) one. Both are absolutely necessary, and the growth of a science depends 
very much on their influencing one another and advancing simultaneously. No lasting 
divergence or subordination of one to the other can be allowed. 

The theoretical part of the science deals with what we suppose to be accurate
determinations, and the object of its reasonings is the development of the form, connection,
and consequences of the hypotheses. But it must change its hypotheses as soon as it is
clear that they are at variance with experience and observation.

The empirical side of the science procures and arranges the observations, compares
them with the theoretical propositions, and is entitled by means of them to reject, if
necessary, the hypotheses of the theory. By induction it can deduce laws from the observations.
But it must not forget (though it may have a natural inclination to do so)
that, as shown above, it is itself founded on hypotheses. The very form of the observation,
and especially the selection of the circumstances which are to be considered as essential
and taken into account in making the several observations, must not be determined by rule
of thumb, or arbitrarily, but must always be guided by theory.

Subject to this it must as a rule be considered best, that the two sides of the
science should work somewhat independently of one another, each in its own particular
way. In what follows the empirical side will be treated exclusively, and it will be treated
on a general plan, investigating not the particular way in which statistical, chemical, physical,
and astronomical observations are made, but the common rules according to which
they are all submitted to computation.


II. LAWS OF ERRORS.

§ 5. Every observation is supposed to contain information, partly as to the
phenomenon in which we are particularly interested, partly as to all the circumstances
connected with it, which are regarded as essential. In comparing several observations, it
makes a very great difference, whether such essential circumstances have remained unchanged,
or whether one or several of them have changed between one observation and another.
The treatment of the former case, that of repetitions, is far simpler than that of the latter,
and is therefore more particularly the subject of our investigations; nevertheless, we must 
try to master also the more difficult general case in its simplest forms, which force them¬ 
selves upon us in most of the empirical sciences. 

By repetitions then we understand those observations, in which all the essential
circumstances remain unchanged, in which therefore the results or phenomena should agree,
if all the operative causes had been included among our essential circumstances. Furthermore,
we can without hesitation treat as repetitions those observations, in which we assume
that no essential circumstance has changed, but do not know for certain that there has
been no such change. Strictly speaking, this would furnish an example of observations
with systematic errors; but provided there has been no change in the care with which the
essential circumstances have been determined or checked, it is permissible to employ the
simpler treatment applicable to the case of repetitions. This would not however be permissible,
if, for instance, the observer during the repetitions has perceived any uncertainty
in the records of a circumstance, and therefore paid greater attention to the following
repetitions.

§ 6. The special features of the observations, and in particular their degree of
accuracy, depend on causes which have been left out as unessential circumstances, or on
some overlooked uncertainty in the statement of the essential circumstances. Consequently
no speculation can indicate to us the accuracy and particularities of observations. These
must be estimated by comparison of the observations with each other, but only in the
case of repetitions can this estimate be undertaken directly and without some preliminary
work. The phrase law of errors is used as a general name for any mathematical expression
representing the distribution of the varying results of repetitions.





Laws of actual errors are such as correspond to repetitions actually carried out.
But observations yet unmade may also be erroneous, and where we have to speak hypothetically
about observations, or have to do with the prediction of results of future repetitions,
we are generally obliged to employ the idea of “laws of errors”. In order to prevent
any misunderstanding we then call this idea “laws of presumptive errors”. The two
kinds of laws of errors cannot generally be quite the same thing. Every variation in the
number of repetitions must entail some variations in the corresponding law of errors; and
if we compare two laws of actual errors obtained from repetitions of the same kind in
equal number, we almost always observe great differences in every detail. In passing from
actual repetitions to future repetitions, such differences at least are to be expected. More¬ 
over, whilst any collection of observations, which can at all be regarded as repetitions, will 
on examination give us its law of actual errors, it is not every series of repetitions that 
can be used for predictions as to future observations. If, for instance, in repeated measure¬ 
ments of an angle, the results of our first measurements all fell within the first quadrant, 
while the following repetitions still more frequently, and at last exclusively, fell within the 
second quadrant, and even commenced to pass into the third, it would evidently be wrong 
to predict that the future repetitions would repeat the law of actual errors for the totality 
of these observations. In similar cases the observations must be rejected as bad or misconceived,
and no law of presumptive errors can be directly based upon them.

§ 7. Suppose, however, that on comparing repetitions of some observation we have
several times determined the law of actual errors in precisely the same way, employing at
first small numbers of repetitions, then larger and still larger numbers for each law. If
then, on comparing these laws of actual errors with one another, we remark that they become
more alike in proportion as the numbers of repetitions grow greater, and that the
agreements extend successively to all those details of the law which are not by necessity 
bound to vary with the number of repetitions, then we cannot have any hesitation in using 
the law of actual errors, deduced from the largest possible number of repetitions, for pre¬ 
dictions concerning future observations, made under essentially the same circumstances. 

This, however, is wholly legitimate only, when it is to be expected that, if we could
obtain repetitions in indefinitely increasing numbers, the law of errors would then approach
a single definite form, namely the law of presumptive errors itself, and would not oscillate
between several forms, or become altogether or partly indeterminate. (Note the analogy
with the difference between converging and oscillating infinite series.) We must therefore
distinguish between good and bad observations, and only the good ones, that is those which 
satisfy the above mentioned condition, the law of large numbers, yield laws of presumptive 
errors and afford a basis for prediction. 
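
The approach of the relative frequency to a fixed value can be tried numerically. The following is only a sketch in Python, not part of the original text. It assumes an ideal die imitated by a seeded pseudo-random generator; the seed and the name relative_frequency_of_six are arbitrary illustrative choices.

```python
import random

# Assumption: an ideal die, simulated with a fixed seed so the run is repeatable.
rng = random.Random(1)
throws = [rng.randint(1, 6) for _ in range(100_000)]

def relative_frequency_of_six(n):
    """Relative frequency of the result 'six' among the first n throws."""
    return throws[:n].count(6) / n

# The same series is examined after 100, 1 000, 10 000 and 100 000 repetitions.
for n in (100, 1_000, 10_000, 100_000):
    print(f"n = {n:>6}: relative frequency of six = {relative_frequency_of_six(n):.4f}")
```

For a good method of observation the printed values should settle near 1/6 as n grows; a series that kept oscillating would be a bad one in the sense of the text.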

As we cannot repeat a thing indefinitely often, we can never be quite certain that
a given method of observation may be called good. Nevertheless, we shall always rely on
laws of actual errors, deduced from very large numbers of concordant repetitions, as sufficiently
accurate approximations to the law of presumptive errors.

And, moreover, the purely hypothetical assumption of the existence of a law of 
presumptive errors may yield some special criteria for the right behaviour of the laws of 
actual errors, corresponding to the increasing number of the repetitions, and establish the 
conditions necessary to justify their use for purposes of prediction. 

We must here notice that, when a series of repetitions by such a test proves bad
and inapplicable, we shall nevertheless often be able, sometimes by a theoretical criticism
of the method, and sometimes by watching the peculiarities in the irregularities of the laws
of errors, to find out the reason why the given method of observation is not as good as
others, and to change it so that the checks will at least show that it has been improved.
In the case mentioned in the preceding paragraph, for instance, the remedy is obvious. The 
time of observation is there to be reckoned among the essential circumstances. 

And if we do not attain our object, but should fail in many attempts at throwing 
light upon some phenomenon by means of good observations, it may be said even at this 
stage, before we have been made acquainted with the various means that may be employed, 
and the various forms taken by the laws of errors, that absolute abandonment of the law 
of large numbers, as quite inapplicable to any given refractory phenomenon, will generally
be out of the question. After repeated failures we may for a time give up the whole
matter in despair; but even the most thorough sceptic may catch himself speculating on
what may be the cause of his failure, and, in doing so, he must acknowledge that the
error is never to be looked for in the objective nature of the conditions, but in an insufficient
development of the methods employed. From this point of view then the law of
large numbers has the character of a belief. There is in all external conditions such a 
harmony with human thought that we, sooner or later, by the use of due sagacity, parti¬ 
cularly with regard to the essential subordinate circumstances of the case, will be able to 
give the observations such a form that the laws of actual errors, with respect to repetitions 
in increasing numbers, will show an approach towards a definite form, which may be con¬ 
sidered valid as the law of presumptive errors and used for predictions. 

§ 8. Four different means of representing the law of errors must be described, and
their respective merits considered, namely: 

Tabular arrangements, 

Curves of Errors, 

Functional Laws of Errors, 

Symmetric Functions of the Repetitions.

In comparing these means of representing the laws of errors, we must take into
consideration which of them is the easiest to employ, and neither this nor the description
of the forms of the laws of errors demands any higher qualification than an elementary 
knowledge of mathematics. But we must take into account also, how far the different forms 
are calculated to emphasise the important features of the laws of errors, i. e. those which 
may be transferred from the laws of actual errors to the laws of presumptive errors. On 
this single point, certainly, a more thorough knowledge of mathematics would be desirable 
than that which may be expected from the majority of those students who are obliged to 
occupy themselves with observations. As the definition of the law of presumptive errors 
presupposes the determination of limiting values to infinitely numerous approximations, 
some propositions from the differential calculus would, strictly speaking, be necessary. 


III. TABULAR ARRANGEMENTS.

§ 9. In stating the results of all the several repetitions we give the law of errors 
in its simplest form. Identical results will of course be noted by stating the number of 
the observations which give them. 

The table of errors, when arranged, will state all the various results and the fre¬ 
quency of each of them. 

The table of errors is certainly improved, when we include in it the relative fre¬ 
quencies of the several results, that is, the ratio which each absolute frequency bears to the 
total number of repetitions. It must be the relative frequencies which, according to the 
law of large numbers, are, as the number of observations is increased, to approach the 
constant values of the law of presumptive errors. Long usage gives us a special word to
denote this transition in our ideas: probability is the relative frequency in a law of presumptive
errors, the proportion of the number of coincident results to the total number,
on the supposition of infinitely numerous repetitions. There can be no objection to considering
the relative frequency of the law of actual errors as an approximation to the
corresponding probability of the law of presumptive errors, and the doubt whether the
relative frequency itself is the best approximation that can be got from the results of the
given repetitions, is rather of theoretical than practical interest. Compare § 73.
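
The arranged table of errors with its relative frequencies may be illustrated by a short sketch in Python. It is not taken from the original: the list of repetitions is hypothetical, and the function name table_of_errors is merely illustrative.

```python
from collections import Counter

def table_of_errors(results):
    """Return rows (result, absolute frequency, relative frequency), sorted by result."""
    n = len(results)
    counts = Counter(results)
    return [(value, counts[value], counts[value] / n) for value in sorted(counts)]

# Hypothetical repetitions of one measurement, already stated in round numbers.
repetitions = [12, 11, 13, 12, 12, 10, 11, 12, 13, 11]

print("result  abs.  rel.")
for value, absolute, relative in table_of_errors(repetitions):
    print(f"{value:>6} {absolute:>5} {relative:>5.2f}")
```

The relative frequencies always sum to unity, whatever the number of repetitions, which is what makes them comparable between tables of different size.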

It makes some difference in several other respects — as well as in the one just 
mentioned — if the phenomenon is such that the results of the repetitions show qualitative 
differences or only differences of magnitude. 

§ 10. In the former case, in which no transition occurs, but where there are such
abrupt differences that none of the results are more closely connected with one another than
with the rest, the tabular form will be the only possible one, in which the law of errors can
be given. This case frequently occurs in statistics and in games of chance, and for this
reason the theory of probabilities, which is the form of the theory of observations in which
these cases are particularly taken into consideration, demands special attention. All previous
authors have begun with it, and made it the basis of the other parts of the science
of observation. I am of opinion, however, that it is both safer and easier to keep it to
the last.

§ 11. If, however, there is such a difference between the results of repetitions, 
that there is either a continuous transition between them, or that some results are nearer 
each other than all the rest, there will be ample opportunity to apply mathematical methods; 
and when the tabular form is retained, we must take care to bring together the results 
that are near one another. A table of the results of firing at a target may for instance 
have the following form: 


                     1 foot to the left   Central   1 foot to the right   Total
1 foot too high              3               17              6              26
Central                     13              109             19             141
1 foot too low               4                8              1              13
Total                       20              134             26             180


If here the heading “1 foot to the left” means that the shot has swerved to the
left between half a foot and one foot and a half, this will remind us that we cannot give
exact measures in such tables, but are obliged to give them in round numbers. The
number of results then will not correspond to such as were exactly the same, but dis¬ 
regarding small differences, we gather into each column those that approach nearest to one 
another, and which all fall within arbitrarily chosen limits. 
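
The grouping into round numbers, and the totals of the target table, can be checked with a small Python sketch. It is an illustration only: the counts are those of the table above, while the function class_of and the choice of half-foot limits between classes are assumptions made for the example.

```python
import math

# Counts from the target table: rows are vertical classes, columns horizontal ones.
counts = [
    [3, 17, 6],     # 1 foot too high
    [13, 109, 19],  # central
    [4, 8, 1],      # 1 foot too low
]

row_totals = [sum(row) for row in counts]
column_totals = [sum(column) for column in zip(*counts)]
grand_total = sum(row_totals)

def class_of(deviation_in_feet):
    """Round-number class of a deviation, with class limits at the half feet."""
    return math.floor(deviation_in_feet + 0.5)

print(row_totals, column_totals, grand_total)
print(class_of(0.7), class_of(-1.2), class_of(0.2))
```

Every shot between half a foot and one foot and a half to one side falls into the same class, so the table records classes, not exact measures.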

In the simple case, where the result of the observation can be expressed by a
single real number, the arranged table not only takes the extremely simple form of a table
of functions with a single argument, but, as we shall see in the following chapters, leads
us to the representation of the law of errors by means of curves of errors and functional
laws of errors.

It is an obvious course to fix the attention on the two extreme results in the table,
and not seldom these alone are given, instead of a law of error, as a sort of index of the 
exactness of the whole series of repetitions, and as the higher and lower limits of the 
observed phenomenon. This index of exactness, however, must be rejected as itself too 
inexact for the purpose, for the oftener the observations are repeated, the farther we must 
expect the extremes to move from one another; and thus the most valuable series of 
observations will appear to possess the greatest range of discrepancy. 
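
That the distance between the extremes can only grow as repetitions are added may be seen in a brief Python sketch. It is an illustration under assumptions: the repetitions are imitated by seeded Gaussian pseudo-random numbers, and the name extreme_range is hypothetical.

```python
import random

rng = random.Random(7)  # fixed seed, an arbitrary choice
# Assumption: repetitions scattered about 0 with unit spread.
series = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

def extreme_range(values):
    """Distance between the highest and the lowest result."""
    return max(values) - min(values)

sizes = (10, 100, 1_000, 10_000)
ranges = [extreme_range(series[:n]) for n in sizes]
for n, r in zip(sizes, ranges):
    print(f"first {n:>6} repetitions: range {r:.2f}")
```

Since each longer series contains the shorter one, its range can never be smaller; the longest and most valuable series therefore shows the widest range.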




On the other hand, if, in a table arranged according to the magnitude of the
values, we select a single middle value, preceded and followed by nearly equal numbers of
values, we shall get a quantity which is very well fitted to represent the whole series of
repetitions. 

If, while we are thus counting the results arranged according to their magnitude,
we also take note of those two values with which we respectively (a) leave the first sixth
part of the total number, and (b) enter upon the last sixth part (more exactly we ought to
say 16 per ct.), we may consider these two as indicating the limits between great and small
deviations. If we state these two values along with the middle one above referred to, we
give a serviceable expression for the law of errors, in a way which is very convenient, and
although rough, is not to be despised. Why we ought to select just the middle value and
the two sixth-part values for this purpose, will appear from the following chapters.
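
The counting here described may be sketched in Python as follows. This is an illustration, not the author's procedure in detail: the frequencies are the totals of the 500 results of a game of patience tabulated in § 14, and the rule used (the smallest result at which the running count reaches the stated fraction) is one assumed reading of "leaving" and "entering" a sixth part.

```python
# Total frequencies of the results 7..19 among 500 games of patience (table in § 14).
frequencies = {7: 3, 8: 7, 9: 35, 10: 101, 11: 89, 12: 94, 13: 70,
               14: 46, 15: 30, 16: 15, 17: 4, 18: 5, 19: 1}

def value_at_fraction(freqs, fraction):
    """Smallest result at which the running count reaches `fraction` of the total."""
    target = fraction * sum(freqs.values())
    running = 0
    for value in sorted(freqs):
        running += freqs[value]
        if running >= target:
            return value
    raise ValueError("fraction must lie between 0 and 1")

lower = value_at_fraction(frequencies, 1 / 6)   # end of the first sixth part
middle = value_at_fraction(frequencies, 1 / 2)  # middle value
upper = value_at_fraction(frequencies, 5 / 6)   # beginning of the last sixth part
print(lower, middle, upper)
```

For this table the three values come out as 10, 12 and 14, so that deviations beyond 10 or 14 would count as great ones.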


IV. CURVES OF ERRORS.

§ 12. Curves of actual errors of repeated observations, each of which we must be 
able to express by one real number, are generally constructed as follows. On a straight 
line as the axis of abscissae, we mark off points corresponding to the observed numerical 
quantities, and at each of these points we draw an ordinate, proportional to the number 
of the repetitions which gave the result indicated by the abscissa. We then with a free 
hand draw the curve of errors through the ends of the ordinates, making it as smooth 
and regular as possible. For quantities and their corresponding abscissae which, from the
nature of the case, might have appeared, but do not really appear, among the repetitions,
the ordinate will be = 0, or the point of the curve falls on the axis of abscissae. Where
this case occurs very frequently, the form of the curves of errors becomes very tortuous,
almost discontinuous. If the observation is essentially bound to discontinuous numbers, for
instance to integers, this cannot be helped. 

§ 13. If the observation is either of necessity or arbitrarily, in spite of some inevitable
loss of accuracy, made in round numbers, so that it gives a lower and a higher
limit for each observation, a somewhat different construction of the curve of errors ought
to be applied, viz. such a one, that the area included between the curve of error, the axis
of abscissae, and the ordinates of the limits, is proportional to the frequency of repetitions
within these limits. But in this way the curve of errors may depend very much on the
degree of accuracy involved in the use of round numbers. This construction of areas
can be made by laying down rectangles between the bounding ordinates, or still better,
trapezoids with their free sides approximately parallel to the tangents of the curve. If the
limiting round numbers are equidistant, the mean heights of the trapezoids or rectangles
are directly proportional to the frequencies of repetition. In this case a preliminary construction
of curve-points can be made as in § 12, and may often be used as sufficient.
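
The construction by areas can be illustrated by a short Python sketch with rectangles. It is offered as an example under assumptions only: the limits and counts are hypothetical, and the function name rectangle_heights is not from the original.

```python
def rectangle_heights(limits, counts):
    """Heights of rectangles whose areas equal the relative frequencies.

    limits: ascending limits of the round-number classes (one more than counts)
    counts: number of repetitions falling between consecutive limits
    """
    if len(limits) != len(counts) + 1:
        raise ValueError("need one more limit than counts")
    n = sum(counts)
    return [c / (n * (hi - lo)) for c, lo, hi in zip(counts, limits, limits[1:])]

# Hypothetical observations grouped between limits that are not equidistant.
limits = [0.0, 1.0, 2.0, 4.0, 8.0]
counts = [10, 30, 40, 20]

heights = rectangle_heights(limits, counts)
areas = [h * (hi - lo) for h, lo, hi in zip(heights, limits, limits[1:])]
print(heights)
print(areas)
```

The third class holds the most repetitions, yet its rectangle is lower than that of the second, because it is twice as wide; only with equidistant limits are the heights proportional to the frequencies.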

It is a very common custom, but one not to be recommended, to draw a broken
line between the observed points instead of a curve.

§ 14. There can be no doubt that the curve of errors, as a form for the law of
errors, has the advantage of perspicuity, and were not the said uncertainty in so many
cases a critical drawback, this would perhaps be sufficient. Moreover, it is in practice
quite possible, and not very difficult, to pass from the curve of actual errors to one which
may hold good for presumptive errors; though, certainly, this transition cannot be founded
upon any positive theory, but depends on skill, which may be acquired by working at good
examples, but must be practised judiciously.

According to the law of large numbers we must expect that, when we draw curves 
of actual errors according to relative frequency, for a numerous series of repetitions, first 
based upon small numbers, afterwards redrawn every time as we get more and more repetitions,
the curves, which at first constantly changed their forms and were plentifully
furnished with peaks and valleys, will gradually become more like each other, as also 
simpler and more smooth, so that at last, when we have a very large but finite number 
of observations, we cannot distinguish the successive figures we have drawn from one an¬ 
other. We may thus directly construct curves of errors, which may be approved as pictures 
of curves of presumptive errors, but in order to do so millions of repetitions, rather than 
thousands, are certainly required. 

If from curves of actual errors for small numbers we are to draw conclusions as
to the curve of presumptive errors, we must guess, but at the same time support our guess,
partly by an estimate of how great irregularities we may expect in a curve of actual errors
for the given number, partly by developing our feeling for the form of regular curves of
that sort, as we must suppose that the curves of presumptive errors will be very regular.
In both respects we must get some practice, but this is easy and interesting.

Without feeling tied down to the particular points that determined the curve of
actual errors, we shall nevertheless try to approach them, and especially not allow
large deviations on the same side to come together. We can generally regard as large
deviations (the reason why will be mentioned in the chapter on the Theory of Probabilities)
those that cause greater errors, as compared with the absolute frequency of the result
in question, than the square root of that number (more exactly $\sqrt{h(n-h)/n}$, where h is the
frequency of the result, n the number of all repetitions). But even deviations two or three
times as great as this ought not always to be avoided, and we may be satisfied, if only
one third of the deviations of the determining points must be called large. We may use
the word “adjustment” (graphical) to express the operation by which a curve of presumptive
errors is determined. (Comp. § 64). The adjustment is called an over-adjustment, if we
have approached too near to some imaginary ideal, but if we have kept too close to the
curve of actual errors, then the curve is said to be under-adjusted.
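
The limit for large deviations can be put into figures with a few lines of Python. The sketch rests on an assumption: the "more exact" expression is taken to be the binomial form, the square root of h(n - h)/n, and the frequency of 101 in 500 repetitions is chosen merely as an instance.

```python
import math

def large_deviation_limit(h, n):
    """sqrt(h (n - h) / n): the assumed 'more exact' limit for a frequency h out of n."""
    return math.sqrt(h * (n - h) / n)

h, n = 101, 500              # for instance, a result found 101 times in 500 repetitions
rough = math.sqrt(h)         # the simple square-root rule
exact = large_deviation_limit(h, n)
print(f"rough limit {rough:.2f}, more exact limit {exact:.2f}")
```

With these figures the rough rule gives about 10 and the more exact one about 9, so an adjusted curve passing more than some 9 repetitions from this point would make a large deviation there.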

Our second guide, the regularity of the curve of errors, is as an aesthetical notion
of a somewhat vague kind. The continuity of the curve is an essential condition, but it
is not sufficient. The regularity here is of a somewhat different kind from that seen in
the examples of simple, continuous curves with which students more especially become
acquainted. The curves of errors get a peculiar stamp, because we would never select the
essential circumstances of the observation so absurdly that the deviations could become
indefinitely large. Nor would we without necessity retain a form of observation which
might bring about discontinuity. It follows that to the abscissae which indicate very large
deviations, must correspond rapidly decreasing ordinates. The curve of errors must have
the axis of abscissae as an asymptote, both to the right and the left. All frequency being
positive, where the curve of errors deviates from the axis of abscissae, it must exclusively
keep on the positive side of the latter. It must therefore more or less get the appearance
of a bow, with the axis of abscissae for the string. In order to train the eye for the
apprehension of this sort of regularity, we recommend the study of figs. 2 & 3, which
represent curves of errors of typical forms, exponential and binomial (comp. the next chapter,
p. 16, seqq.), and a comparison of them with figures which, like No. 1, are drawn from
actual observations without any adjustment.

The best way to acquire practice in drawing curves of errors, which is so important 
that no student ought to neglect it, may be to select a series of observations, for which 
the law of presumptive errors may be considered as known, and which is before us in 
tabular form. 

We commence by drawing curves of actual errors for the whole series of observa¬ 
tions; then for tolerably large groups of the same, and lastly for small groups taken at 
random and each containing only a few observations. On each drawing we draw also, 
besides the curve of actual errors, another one of the presumptive errors, on the same 
scale, so that the abscissae are common, and the ordinates indicate relative frequencies in
proportion to the same unit of length for the total number. The proportions ought to be
chosen so that the whole part of the axis of abscissae which deviates sensibly from the
curve, is between 2 and 5 times as long as the largest ordinate of the curve.

Prepared by the study of the differences between the curves, we pass on at last
to the construction of curves of presumptive errors immediately from the scattered points
of the curve which correspond to the observed frequencies. In this construction we must
not consider ourselves obliged to reproduce the curve of presumptive errors which we may
know beforehand; our task is to represent the observations as nearly as possible by means
of a curve which is as smooth and regular as that curve.

The following table of 500 results, got by a game of patience, may be treated in
this way as an exercise.


Result   Actual frequency for groups of 25 repetitions                    of 100 repetitions  of 500
         I            II           III          IV           V               I  II III  IV   V   500

   7     0  0  0  0   0  0  0  1   0  0  1  0   0  0  0  0   0  0  1  0      0   1   1   0   1     3
   8     0  0  0  1   0  2  2  0   0  1  1  0   0  0  0  0   0  0  0  0      1   4   2   0   0     7
   9     1  3  1  1   5  3  1  1   3  2  2  0   3  1  0  3   1  1  2  1      6  10   7   7   5    35
  10     9  2  9  5   6  6  6  4   5  4  8  3   3  3  5  6   4  6  3  4     25  22  20  17  17   101
  11     3  6  3  3   3  6  4  4   5  5  3  5   3  7  2  5   5  6  3  8     15  17  18  17  22    89
  12     8  5  3  4   3  3  2  8   3  7  4  6   5  4  6  5   3  3  5  7     20  16  20  20  18    94
  13     2  4  4  3   6  3  3  1   4  1  1  3   5  4  3  6   7  3  6  1     13  13   9  18  17    70
  14     1  2  2  4   1  0  2  3   2  1  2  4   3  5  4  0   4  0  2  4      9   6   9  12  10    46
  15     0  1  2  2   1  1  3  2   3  2  2  3   1  1  3  0   0  2  1  0      5   7  10   5   3    30
  16     1  2  0  1   0  1  1  0   0  1  0  1   2  0  0  0   0  3  2  0      4   2   2   2   5    15
  17     0  0  1  0   0  0  0  0   0  0  0  0   0  0  2  0   0  1  0  0      1   0   0   2   1     4
  18     0  0  0  1   0  0  1  0   0  1  1  0   0  0  0  0   1  0  0  0      1   1   2   0   1     5
  19     0  0  0  0   0  0  0  1   0  0  0  0   0  0  0  0   0  0  0  0      0   1   0   0   0     1

Total   25 25 25 25  25 25 25 25  25 25 25 25  25 25 25 25  25 25 25 25    100 100 100 100 100   500

Law of presumptive errors (relative frequencies):

Result    Rel. freq.      Result    Rel. freq.
   7      0.0003            7.5     0.0019
   8      0.0071            8.5     0.0192
   9      [illegible]       9.5     0.0636
  10      [illegible]      10.5     0.1005
  11      0.1054           11.5     0.1021
  12      0.0934           12.5     [illegible]
  13      0.0705           13.5     0.0591
  14      0.0486           14.5     0.0387
  15      0.0298           15.5     0.0216
  16      0.0145           16.5     0.0088
  17      0.0046           17.5     0.0020
  18      0.0007           18.5     0.0002
  19      0.0000
                           Sum      0.9999



The law of presumptive errors here given is not the direct result of free-hand con¬ 
struction; but the curve so got has been improved by interpolation of the logarithms of 
its statements of the relative frequencies, together with the formation of mean numbers 
for the deviations, a proceeding which very often will give good results, but which is not 
strictly necessary. By this we can also determine the functional law of errors (comp. the
next chapter). The equation of the curve is:

$$\log y = 2.0228 + 0.0030\,(x-11) - 0.06885\,(x-11)^2 + 0.01515\,(x-11)^3 - 0.001675\,(x-11)^4$$
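
The equation of the curve may be evaluated directly, as in the following Python sketch. It is an illustration resting on assumptions: the coefficients are a reading of the damaged print (a fourth-degree polynomial in x - 11 for the common logarithm of y), the ordinates are taken per thousand, and half-unit steps from 7 to 19 are assumed for the tabulated law. The reading agrees with the legible tabular values 0.1005, 0.1054 and 0.0934.

```python
def log10_y(x):
    """Common logarithm of the ordinate; coefficients as read from the printed equation."""
    d = x - 11
    return 2.0228 + 0.0030 * d - 0.06885 * d**2 + 0.01515 * d**3 - 0.001675 * d**4

def y(x):
    """Ordinate of the curve of presumptive errors, in thousandths."""
    return 10 ** log10_y(x)

# Ordinates at half-unit steps from 7 to 19 (assumed spacing of the tabulated law).
abscissae = [7 + 0.5 * k for k in range(25)]
ordinates = [y(x) for x in abscissae]
total = sum(ordinates)

for x in (10.5, 11, 11.5, 12):
    print(f"x = {x:>4}: y = {y(x):6.1f}")
print(f"sum of the 25 ordinates = {total:.1f}")
```

That the 25 ordinates add up to very nearly 1000 supports the assumed half-unit spacing, since the relative frequencies of the table add up to 0.9999.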

§ 15. By the study of many curves of presumptive errors, and especially such as 
represent ideal functional laws of errors, we cannot fail to get the impression that there 
exists a typical form of curves of errors, which is particularly distinguished by symmetry. 
Familiarity with this form is useful for the construction of curves of presumptive errors. 
But we must not expect to get it realised in all cases. For this reason I have considered 
it important to give, alongside of the typical curves, an example taken from real observations
of a skew curve of errors, which in consequence of its marked want of symmetry
deviates considerably from the typical form. Fig. 4 shows this last mentioned law of
presumptive errors. 

Deviation from the typical form does not indicate that the observations are not 
good. But it may become so glaring that we are forced by it to this conclusion. If, for 
instance, between the extreme values of repetitions — abscissae -- there are intervals which 
are as free from finite ordinates as the space beyond the extremes, so that the curve of 
errors is divided into two or several smaller curves of errors beside one another, there can 
scarcely be any doubt that we have not a series of repetitions proper, but a combination
of several; that is to say, different methods of observation have been used and the results
mixed up together. In such cases we cannot expect that the law of large numbers will
remain in force, and we had better, therefore, reject such observations, if we cannot retain
them by tracing out the essential circumstances which distinguish the groups of the series,
but have been overlooked.

§ 16. When a curve of presumptive errors is drawn, we can measure the magnitude 
of the ordinate for any given abscissa; so far then we know the law of errors perfectly, by 
means of the curve of errors, but certainly in the tabular form only, with all its copiousness.
Whether we can advance further depends on whether we succeed in interpolating in
the table so found, and particularly on whether we can, either from the table or direct from
the curve of errors, by measurement obtain a comparatively small number of constants, by
which to determine the special peculiarities of the curve.

By interpolating, by means of Newton's formula, the logarithms of the frequencies, 
or by drawing the curves of errors with the logarithms of the frequencies as ordinates, 
we often succeed, as above mentioned, in giving the curve the form of a parabola of low 
(and always even) degree. 

Still easier is it to make use of the circumstance that fairly typical curves of errors
show a single maximum ordinate, and an inflexion on each side of it, near which the
curve for a short distance is almost rectilinear. By measuring the co-ordinates of the
maximum point and of the points of inflexion, we shall get data sufficient to enable us to
draw a curve of errors which, as a rule, will deviate very little from the original. All this,
however, holds good only of the curves of presumptive errors. With the actual ones we
cannot operate in this way, and the transition from the latter to the former seems in the
meantime to depend on the eye's sense of beauty.


V. FUNCTIONAL LAWS OF ERRORS.

§ 17. Laws of errors may be represented in such a way that the frequency of 
the results of repetitions is stated as a mathematical function of the number, or numbers, 
expressing the results. This method only differs from that of curves of errors in the
circumstance that the curve which represents the errors has been replaced by its mathematical
formula; the relationship is so close that it is difficult, when we speak of these two
methods, to maintain a strict distinction between them. 

In former works on the theory of observations the functional law of errors is the 
principal instrument. Its source is mathematical speculation; we start from the properties 
which are considered essential in ideally good observations. From these the formula for 
the typical functional law of errors is deduced; and then it remains to determine how 
to make computations with observations in order to obtain the most favourable or most 
probable results. 

Such investigations have been carried through with a high degree of refinement; 
but it must be regretted that in this way the real state of things is constantly disregarded. 
The study of the curves of actual errors and the functional forms of laws of actual errors 
have consequently been too much neglected.

The representation of functional laws of errors, whether laws of actual errors or laws 
of presumptive errors founded on these, must necessarily begin with a table of the results 
of repetitions, and be founded on interpolation of this table. We may here be content to 
study the cases in which the arguments (i. e. the results of the repetitions) proceed by 
constant differences, and the interpolated function, which gives the frequency of the 
argument, is considered as the functional law of errors. Here the only difficulty we
encounter is that we cannot directly employ the usual Newtonian formula of interpolation,
as this supposes that the function is an integral algebraic one, and gives infinite values
for infinite arguments, whether positive or negative, whereas here the frequency of these
infinite arguments must be 0. We must therefore employ some artifice, and an obvious
one is to interpolate, not the frequency itself, $y$, but its reciprocal, $\frac{1}{y}$. This, however, turns
out to be inapplicable; for $\frac{1}{y}$ will often become infinite for finite arguments, and will, at
any rate, increase much faster than any integral function of low degree.




But, as we have already said, the interpolation generally succeeds, when we apply
it to the logarithm of the frequency, assuming that

$$\operatorname{Log} y = a + bx + cx^2 + \dots + gx^{2\kappa},$$

where the function on the right side begins with the lowest powers of the argument $x$,
and ends with an even power whose coefficient $g$ must be negative. Without this latter
condition the computed frequency,

$$y = 10^{\,a + bx + cx^2 + \dots + gx^{2\kappa}} \qquad (1)$$

would again become infinitely great for $x = \pm\infty$. That the observed frequency is often
$= 0$, and its logarithm $= -\infty$ like [illegible], does no harm. Of course we must leave out
these frequencies of the interpolation, or replace them by very small finite frequencies, a
few of which it may become necessary to select arbitrarily. As a rule it is possible to
succeed by this means. In order to represent a given law of actual errors in this way, we
must, according to the rule of interpolation, determine the coefficients $a, b, c, \dots, g$, whose
number must be at least as large as that of the various results of repetitions with which
we have to deal. This determination, of course, is a troublesome business.

Here also we may suppose that the law of presumptive errors is simpler than that 
of the actual errors. And though this, of course, does not imply that $\operatorname{Log} y$ can be
expressed by a small number of terms containing the lowest powers of $x$, this supposition,
nevertheless, is so obvious that it must, at any rate, be tried before any other.

§ 18. Among these, the simplest case, namely that in which $\operatorname{Log} y$ is a function
of $x$ of the second degree

$$\operatorname{Log} y = a + bx - cx^2,$$

gives us the typical form for the functional law of errors, and for the curve of errors, or
with other constants

$$y = h\,e^{-\frac12\left(\frac{x-m}{n}\right)^2}, \qquad (2)$$

where

$$e = 1 + \frac{1}{1} + \frac{1}{1\cdot 2} + \frac{1}{1\cdot 2\cdot 3} + \cdots.$$

The function has therefore no other constants than those which may be interpreted
as unit for the frequencies, $h$, and as zero, $m$, and unit, $n$, for the observed values; the
corresponding typical curve of errors has therefore in all essentials a fixed form.
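
The passage from the coefficients of the second-degree form of $\operatorname{Log} y$ to the constants $h$, $m$, $n$ of the typical form (2) may be sketched numerically as follows. This is only an illustration, assuming that Log denotes the logarithm to base 10 as in formula (1); the function names and the sample coefficients are not from the text.

```python
import math

def typical_constants(a, b, c):
    """Convert Log10(y) = a + b*x - c*x**2 (with c > 0) into the constants
    h, m, n of y = h * exp(-0.5 * ((x - m) / n) ** 2)."""
    if c <= 0:
        raise ValueError("c must be positive, otherwise y grows without bound")
    ln10 = math.log(10.0)
    m = b / (2.0 * c)                      # abscissa of the maximum ordinate
    n = 1.0 / math.sqrt(2.0 * c * ln10)    # distance from maximum to inflexion
    h = 10.0 ** (a + b * b / (4.0 * c))    # maximum ordinate
    return h, m, n

def typical_law(x, h, m, n):
    """Ordinate of the typical law of errors (2)."""
    return h * math.exp(-0.5 * ((x - m) / n) ** 2)

# Illustrative coefficients (hypothetical, not taken from any observation).
a, b, c = 1.0, 0.4, 0.1
h, m, n = typical_constants(a, b, c)
print(h, m, n)
```

Both forms describe the same curve; only the choice of constants differs.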

The functional form of the typical law of errors has applications in mathematics
which are almost as important as those of the exponential, logarithmic, and trigonometrical
functions. In the theory of observations its importance is so great that, though it has
been over-estimated by some writers, and though many good observations show presumptive
as well as actual laws of errors that are not typical, yet every student must make himself
perfectly familiar with its properties.





Expanding the index we get

$$h\,e^{-\frac12\left(\frac{x-m}{n}\right)^2} = h\,e^{-\frac12\left(\frac{m}{n}\right)^2}\cdot e^{\frac{mx}{n^2}}\cdot e^{-\frac12\left(\frac{x}{n}\right)^2}, \qquad (3)$$

so that the general function resolves itself into a product of three factors, the first of which
is constant, the second an ordinary exponential function, while the third remains a typical
functional law of errors. Long usage reduces this form to $e^{-x^2}$; but this form cannot be
recommended. In the majority of its purely mathematical applications [illegible] is preferable,
unless (as in the whole theory of observations) the factor $\frac12$ in the index is to be preferred
on account of the resulting simplification of most of the derived formulae.

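The differential coefficients of $e^{-\frac12\left(\frac{x}{n}\right)^2}$ with regard to $x$ are

$$\begin{aligned}
D\,e^{-\frac12(\frac{x}{n})^2} &= -n^{-2}\,x\;e^{-\frac12(\frac{x}{n})^2}\\
D^2 e^{-\frac12(\frac{x}{n})^2} &= n^{-4}(x^2 - n^2)\;e^{-\frac12(\frac{x}{n})^2}\\
D^3 e^{-\frac12(\frac{x}{n})^2} &= -n^{-6}(x^3 - 3n^2x)\;e^{-\frac12(\frac{x}{n})^2}\\
D^4 e^{-\frac12(\frac{x}{n})^2} &= n^{-8}(x^4 - 3n^2\cdot 2x^2 + 1\cdot 3n^4)\;e^{-\frac12(\frac{x}{n})^2}\\
D^5 e^{-\frac12(\frac{x}{n})^2} &= -n^{-10}(x^5 - 5n^2\cdot 2x^3 + 3\cdot 5n^4x)\;e^{-\frac12(\frac{x}{n})^2}\\
D^6 e^{-\frac12(\frac{x}{n})^2} &= n^{-12}(x^6 - 5n^2\cdot 3x^4 + 3\cdot 5n^4\cdot 3x^2 - 1\cdot 3\cdot 5n^6)\;e^{-\frac12(\frac{x}{n})^2}
\end{aligned} \qquad (4)$$

The law of the numerical coefficients (products of odd numbers and binomial
numbers) is obvious. The general expression of $D^r e^{-\frac12(\frac{x}{n})^2}$ can be got from a comparison
of the coefficients to $(-m)^r$ of the two identical series for equation (3), one being the Taylor
series, the other the product of $e^{-\frac12(\frac{x}{n})^2}$ and the two exponential series with $mx$ and $m^2$ as
arguments. It can also be induced from the differential equation

$$n^2 D^{r+2}\varphi + x\,D^{r+1}\varphi + (r+1)\,D^r\varphi = 0, \qquad \varphi = e^{-\frac12\left(\frac{x}{n}\right)^2}.$$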
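
The induction from the differential equation $n^2 D^{r+2}\varphi + xD^{r+1}\varphi + (r+1)D^r\varphi = 0$ can be carried out mechanically. The following is a minimal sketch, assuming we write $D^r\varphi = P_r(x)\,\varphi$ with $\varphi = e^{-\frac12(x/n)^2}$ and store each polynomial $P_r$ as a list of coefficients; the function name is illustrative only.

```python
from fractions import Fraction

def derivative_polynomials(r_max, n=1):
    """Return [P_0, ..., P_r_max] where D^r phi = P_r(x) * phi for
    phi = exp(-x**2 / (2 * n**2)). Each P_r is a list of exact
    coefficients, lowest power of x first."""
    n2 = Fraction(n) ** 2
    polys = [[Fraction(1)], [Fraction(0), -1 / n2]]   # P_0 = 1, P_1 = -x / n^2
    for r in range(r_max - 1):
        p_r, p_r1 = polys[r], polys[r + 1]
        # n^2 * P_{r+2} + x * P_{r+1} + (r + 1) * P_r = 0
        new = [Fraction(0)] + p_r1                    # coefficients of x * P_{r+1}
        for k, coeff in enumerate(p_r):
            new[k] += (r + 1) * coeff
        polys.append([-c / n2 for c in new])
    return polys[: r_max + 1]

polys = derivative_polynomials(6)
print(polys[4])   # coefficients of x^4 - 6 x^2 + 3, lowest power first
```

With $n = 1$ the polynomials agree with the numerical coefficients listed above, for instance $3\cdot 2 = 6$ and $1\cdot 3 = 3$ in the fourth differential coefficient.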

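Inversely, we obtain for the products of the typical law of errors by powers of $x$,
writing $\varphi = e^{-\frac12\left(\frac{x}{n}\right)^2}$,

$$\begin{aligned}
x\,\varphi &= -n^2 D\varphi\\
x^2\varphi &= n^4 D^2\varphi + n^2\varphi\\
x^3\varphi &= -n^6 D^3\varphi - 3n^4 D\varphi\\
x^4\varphi &= n^8 D^4\varphi + 6n^6 D^2\varphi + 3n^4\varphi\\
x^5\varphi &= -n^{10} D^5\varphi - 10n^8 D^3\varphi - 15n^6 D\varphi\\
x^6\varphi &= n^{12} D^6\varphi + 15n^{10} D^4\varphi + 45n^8 D^2\varphi + 15n^6\varphi
\end{aligned} \qquad (5)$$

the numerical coefficients being the same as above (4). This proposition can be
demonstrated by the identical equation $n^{-2}x^{r+1}\varphi = -D(x^r\varphi) + r\,x^{r-1}\varphi$.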

By means of these formulae every product of any integral rational function by
exponential functions and functional typical laws of errors can be reduced to the form


$$k_0\varphi + k_1 D\varphi + k_2 D^2\varphi + \cdots,$$

where

$$\varphi = e^{-\frac12\left(\frac{x-m}{n}\right)^2},$$

and thus they can easily be differentiated and integrated. Every quadrature of this form
can be reduced to

$$f_1(x)\,\varphi(x) + f_2(x)\int \varphi(x)\,dx,$$

where $f_1(x)$ and $f_2(x)$ are integral rational functions; thus a very large class of problems
can be solved numerically by aid of the following table of the typical or exponential
functional law of errors, together with the table of its integral





[Table of the typical law of errors. For $z = 0.0, 0.1, 0.2, \dots, 4.5$ it gives, in successive columns, $\int_0^z \eta\,dz$ (five decimals), $\eta = e^{-\frac12 z^2}$ (four decimals), $\frac{d\eta}{dz}$, $\frac{d^2\eta}{dz^2}$, $\frac{d^3\eta}{dz^3}$ and $\frac{d^4\eta}{dz^4}$. Entries that remain legible: for $z = 1.0$: $0.85562$, $0.6065$, $-0.607$; for $z = 2.0$: $1.19629$, $0.1353$, $-0.271$; the integral reaches $1.25331$ at $z = 4.5$.]



Here $\eta$, $\frac{d^2\eta}{dz^2}$ and $\frac{d^4\eta}{dz^4}$ are, each of them, the same for positive and negative values
of $z$; the other columns of the table change signs with $z$.
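
The columns of such a table can be recomputed directly. The sketch below assumes $\eta = e^{-\frac12 z^2}$ and obtains the integral from the error function of the Python standard library; the function names and the print layout are illustrative and do not reproduce the original typesetting.

```python
import math

def table_row(z):
    """Columns of the table for one argument z, with eta = exp(-z^2 / 2)."""
    eta = math.exp(-0.5 * z * z)
    # Integral of eta from 0 to z, expressed through the error function.
    integral = math.sqrt(math.pi / 2.0) * math.erf(z / math.sqrt(2.0))
    d1 = -z * eta                              # first differential coefficient
    d2 = (z**2 - 1.0) * eta                    # second
    d3 = -(z**3 - 3.0 * z) * eta               # third
    d4 = (z**4 - 6.0 * z**2 + 3.0) * eta       # fourth
    return z, integral, eta, d1, d2, d3, d4

def print_table(z_max=4.5, step=0.1):
    """Print one line per argument from 0 to z_max."""
    rows = int(round(z_max / step)) + 1
    for i in range(rows):
        z, integral, eta, d1, d2, d3, d4 = table_row(i * step)
        print(f"{z:3.1f} {integral:8.5f} {eta:7.4f} {d1:7.3f} {d2:6.2f} {d3:5.1f} {d4:4.0f}")

if __name__ == "__main__":
    print_table()
```

The even columns ($\eta$ and its second and fourth differential coefficients) are unchanged when $z$ is replaced by $-z$, while the integral and the odd differential coefficients change sign.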

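The interpolations are easily worked out by means of Taylor's theorem:

$$\eta(z+\zeta) = \eta + \zeta\,\frac{d\eta}{dz} + \frac{\zeta^2}{1\cdot 2}\,\frac{d^2\eta}{dz^2} + \frac{\zeta^3}{1\cdot 2\cdot 3}\,\frac{d^3\eta}{dz^3} + \cdots$$

and

$$\int_0^{z+\zeta}\eta\,dz = \int_0^{z}\eta\,dz + \zeta\,\eta + \frac{\zeta^2}{1\cdot 2}\,\frac{d\eta}{dz} + \frac{\zeta^3}{1\cdot 2\cdot 3}\,\frac{d^2\eta}{dz^2} + \cdots$$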

The typical form for the functional law of errors (2) shows that the frequency is
always positive, and that it arranges itself symmetrically about the value $x = m$, for which
the frequency has its maximum value $y = h$. For $x = m \pm n$ the frequency is $y = h \cdot 0.60653$.
The corresponding points in the curve of errors are the points of inflexion. The area
between the curve of errors and the axis of abscissae, reckoned from the middle to $x = m \pm n$,
will be $nh \cdot 0.85562$; and as the whole area from one asymptote to the other is $nh\sqrt{2\pi}$
$= nh \cdot 2.50663$, only $nh \cdot 0.39769$ of it falls outside either of the inflexions, consequently
not quite that sixth part (more exactly 16 per ct.) which is the foundation of the rule,
given in § 11, as to the limit between the great and small errors.

The above table shows how rapidly the function of the typical law of errors
decreases toward zero. In almost all practical applications of the theory of observations
$e^{-\frac12 z^2} = 0$, if only $z > 5$. Theoretically this superior asymptotical character of the function
is expressed in the important theorem that, for $z = \pm\infty$, not only $e^{-\frac12 z^2}$ itself is $= 0$
but also all its differential coefficients; and that, furthermore, all products of this function
by every algebraic integral function and by every exponential function, and all the differential
quotients of these products, are equal to zero.

In consequence of this theorem, the integral $\int_{-\infty}^{+\infty} e^{-\frac12 z^2}\,dz = \sqrt{2\pi}$ can be computed
as the sum of equidistant values of $e^{-\frac12 z^2}$ multiplied by the interval of the arguments
without any correction. This simple method of computation is not quite correct, the
underlying series for conversion of a sum into an integral being only semiconvergent in
this case; for very large intervals the error can be easily stated, but as far as intervals
of one unit the numbers taken out of our table are not sufficient to show this error.
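
How little the plain sum of equidistant ordinates differs from $\sqrt{2\pi}$ can be checked numerically. The sketch below is only an illustration; the function name and the cut-off `z_max` (beyond which the ordinates are treated as negligible) are assumptions of this example.

```python
import math

def sum_approximation(interval, z_max=40.0):
    """Interval times the sum of exp(-z^2 / 2) at the equidistant arguments
    z = 0, +-interval, +-2*interval, ... up to |z| <= z_max."""
    k_max = int(z_max / interval)
    total = sum(math.exp(-0.5 * (k * interval) ** 2)
                for k in range(-k_max, k_max + 1))
    return interval * total

exact = math.sqrt(2.0 * math.pi)
for interval in (0.5, 1.0, 2.0, 3.0):
    # Difference between the uncorrected sum and the true integral.
    print(interval, sum_approximation(interval) - exact)
```

For an interval of one unit the difference lies far below the last decimal of a four- or five-figure table, whereas for intervals of two or three units it becomes plainly visible.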

If the curve of errors is to give relative frequency directly, the total area must be
$1 = nh\sqrt{2\pi}$; $h$ consequently ought to be put $= \frac{1}{n\sqrt{2\pi}}$.

Problem 1. Prove that every product of typical laws of errors in the functional
form $\varphi(x) = h\,e^{-\frac12\left(\frac{x-m}{n}\right)^2}$, with the same independent variable $x$, is itself a typical law of errors.
How do the constants $h$, $m$, and $n$ change in such a multiplication?

Problem 2. How small are the frequencies of errors exceeding [illegible] times
the mean error, on the supposition of the typical law of errors?

Problem 3. To find the values of the definite integrals

$$s_r = \int_{-\infty}^{+\infty} x^r e^{-\frac12\left(\frac{x}{n}\right)^2}\,dx.$$

Answer: $s_{2r+1} = 0$ and $s_{2r} = 1\cdot 3\cdot 5\cdots(2r-1)\,n^{2r+1}\sqrt{2\pi}$.

§ 19. Nearly related to the typical or exponential law of errors in functional form
are the binomial functions, which are known from the coefficients of the terms of the
$n$th power of a binomial, regarded as a function of the number $x$ of the term.


  n \ x    0    1    2    3     4     5     6     7

    1      1    1
    2      1    2    1
    3      1    3    3    1
    4      1    4    6    4     1
    5      1    5   10   10     5     1
    6      1    6   15   20    15     6     1
    7      1    7   21   35    35    21     7     1
    8      1    8   28   56    70    56    28     8
    9      1    9   36   84   126   126    84    36
   10      1   10   45  120   210   252   210   120
   11      1   11   55  165   330   462   462   330
   12      1   12   66  220   495   792   924   792
   13      1   13   78  286   715  1287  1716  1716
   14      1   14   91  364  1001  2002  3003  3432


For integral values of the argument the binomial function can be computed directly
by the formula

$$\beta_n(x) = \frac{n(n-1)\cdots(n-x+1)}{1\cdot 2\cdots x}. \qquad (9)$$

When the binomial numbers for $n$ are known, those for $n + 1$ are easily found
by the formula

$$\beta_{n+1}(x) = \beta_n(x) + \beta_n(x-1). \qquad (10)$$

By substitution according to (9) we easily demonstrate the proposition that, for
any integral values of $n$, $r$ and $t$,

$$\beta_n(r)\,\beta_r(t) = \beta_n(t)\,\beta_{n-t}(r-t), \qquad (11)$$

which means that, when the trinomial $(a + b + c)^n$ is developed, it is indifferent whether
we consider it to be $((a+b)+c)^n$ or $(a+(b+c))^n$.
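
The step from the binomial numbers for $n$ to those for $n + 1$, and the trinomial proposition, can both be checked mechanically. The sketch below assumes the usual reading of the binomial function for integral arguments, $\beta_n(x) = \binom{n}{x}$, and states the trinomial proposition in the form $\binom{n}{r}\binom{r}{t} = \binom{n}{t}\binom{n-t}{r-t}$; the function names are illustrative.

```python
from math import comb

def binomial_rows(n_max):
    """Rows beta_n(0..n) for n = 0..n_max, each row built from the previous
    one by beta_{n+1}(x) = beta_n(x) + beta_n(x - 1)."""
    rows = [[1]]
    for n in range(n_max):
        prev = rows[-1]
        row = [(prev[x] if x <= n else 0) + (prev[x - 1] if x >= 1 else 0)
               for x in range(n + 2)]
        rows.append(row)
    return rows

def trinomial_identity_holds(n, r, t):
    """Check C(n, r) * C(r, t) == C(n, t) * C(n - t, r - t) for 0 <= t <= r <= n."""
    return comb(n, r) * comb(r, t) == comb(n, t) * comb(n - t, r - t)

rows = binomial_rows(14)
print(rows[10][:8])
```

Both sides of the identity count the ways of splitting $n$ things into three groups of $t$, $r - t$ and $n - r$, which is why the order of grouping in the trinomial is indifferent.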

For fractional values of the argument $x$, the binomial function $\beta_n(x)$ can be taken
in an infinity of different ways, for instance by

$$\beta_0(x) = \frac{\sin \pi x}{\pi x}.$$

This formula results from a direct application of Lagrange's method of interpolation, and
leads by (10) to the more general formula

$$\beta_n(x) = \frac{\sin \pi x}{\pi x}\cdot\frac{1\cdot 2\cdots n}{(1-x)(2-x)\cdots(n-x)}. \qquad (12)$$


This species of binomial function may be considered the simplest possible, and has 
some importance in pure mathematics; but as an expression of frequencies of observed 
values, or as a law of errors, it is inadmissible because, for $x > n$ or $x$ negative, it gives
negative values alternating with positive values periodically. 
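
The alternation of sign outside the interval from 0 to $n$ is easy to exhibit. The following sketch assumes the interpolated binomial function in the form $\frac{\sin \pi x}{\pi x}\cdot\frac{1\cdot 2\cdots n}{(1-x)(2-x)\cdots(n-x)}$; the function name is illustrative, and integral arguments are treated separately only to avoid a division of zero by zero.

```python
import math

def beta_sine(n, x):
    """Interpolated binomial function:
    sin(pi x) / (pi x) * n! / ((1 - x)(2 - x)...(n - x))."""
    if float(x).is_integer():
        k = int(x)
        return float(math.comb(n, k)) if 0 <= k <= n else 0.0
    value = math.sin(math.pi * x) / (math.pi * x)
    for k in range(1, n + 1):
        value *= k / (k - x)       # builds n! / ((1 - x)...(n - x)) factor by factor
    return value

for x in (2.5, 3.5, 4.5, 5.5):
    print(x, beta_sine(2, x))      # signs alternate beyond x = n = 2
```

Between 0 and $n$ the function agrees with the continuation of the binomial coefficient by the gamma function, which is one way to test the sketch.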

This, however, may be remedied. As $\beta_0(x)$ has no other values than 0 and 1,
when $x$ is integral, we can put for instance

$$\beta_0(x) = \left(\frac{\sin \pi x}{\pi x}\right)^2;$$


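by (10) then

$$\beta_n(x) = \left(\frac{\sin \pi x}{\pi}\right)^2\left(\frac{\beta_n(0)}{x^2} + \frac{\beta_n(1)}{(x-1)^2} + \cdots + \frac{\beta_n(n)}{(x-n)^2}\right). \qquad (13)$$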


Here the values of the binomial function are constantly positive or 0. But this
form is cumbersome; and although for $x = \pm\infty$ the function and its principal coefficients
are $= 0$, this property is lost here, when we multiply by integral algebraic or by exponential
functions.

These unfavourable circumstances detract greatly from the merits of the binomial 
functions as expressions for continuous laws of errors. 

When, on the contrary, the observations correspond only to integral values of the 
argument, the original binomial functions are most valuable means for treating them. That 
$\beta_n(x) = 0$, if $x > n$ or negative, is then of great importance. But this case must be referred
to special investigations. 

§ 20. To represent non-typical laws of errors in functional form we have now 
the choice between at least three different plans:





1) the formula (1), or

2) the products of integral algebraic functions by a typical function or (11)

3) a sum of several typical functions

$$y = \sum_i h_i\,e^{-\frac12\left(\frac{x-m_i}{n_i}\right)^2}. \qquad (14)$$


This account of the more prominent among the functional forms, which we have at our 
disposal for the representation of laws of errors, may prove that we certainly possess good 
instruments, by means of which we can even in more than one form find general series 
adapted for the representation of laws of errors. We do not want forms for the series, 
required in theoretical speculations upon laws of errors; nor is the exact representation of 
the actual frequencies more than reasonably difficult. If anything, we have too many forms
and too few means of estimating their value correctly. 

As to the important transition from laws of actual errors to those of presumptive 
errors, the functional form of the law leaves us quite uncertain. The convergency of the 
series is too irregular, and cannot in the least be foreseen. 

We ask in vain for a fixed rule, by which we can select the most important and 
trustworthy forms with limited numbers of constants, to be used in predictions. And even 
if we should have decided to use only the typical form by the laws of presumptive errors, 
we still lack a method by which we can compute its constants. The answer, that the
"adjustment" of the law of errors must be made by the "method of least squares", may
not be given till we have attained a satisfactory proof of that method; and the attempts
that have been made to deduce it by speculations on the functional laws of errors must,
I think, all be regarded as failures. 


VI. LAWS OF ERRORS
EXPRESSED BY SYMMETRICAL FUNCTIONS.

§ 21. All constants in a functional law of errors, every general property of a 
curve of errors or, generally, of a law of numerical errors, must be symmetrical functions 
of the several results of the repetitions, i. e. functions which are not altered by inter¬ 
changing two or more of the results. For, as all the values found by the repetitions 
correspond to the same essential circumstances, no interchanging whatever can have any 
influence on the law of errors. Conversely, any symmetrical function of the values of the
observations will represent some property or other of the law of errors. And we must be
able to express the whole law of errors itself by every such collection of symmetrical
functions, by which every property of the law of errors can be expressed as unambiguously
as by the very values found by the repetitions.

We have such a collection in the coefficients of that equation of the $n$th degree,
whose roots are the $n$ observed values. For if we know these coefficients, and solve the
equation, we get an unambiguous determination of all the values resulting from the
repetitions, i. e. the law of errors. But other collections also fulfil the same requirements; the
essential thing is that the $n$ symmetrical functions are rational and integral, and that one
of them has each of the degrees $1, 2, \dots, n$, and that none of them can be deduced from
the others.

The collection of this sort that is easiest to compute, is the sums of the powers.
With the observed values

$$o_1, o_2, \dots, o_n$$

we have

$$\begin{aligned}
s_0 &= o_1^0 + o_2^0 + \cdots + o_n^0 = n\\
s_1 &= o_1 + o_2 + \cdots + o_n\\
s_2 &= o_1^2 + o_2^2 + \cdots + o_n^2\\
&\;\;\vdots\\
s_r &= o_1^r + o_2^r + \cdots + o_n^r
\end{aligned} \qquad (15)$$

and the fractions $\frac{s_r}{s_0}$ may also be employed as an expression for the law of errors; it is
only important to reduce the observations to a suitable zero which must be an average
value of $o_1, \dots, o_n$; for if the differences between the observations are small, as compared
with their differences from the average, then

$$\frac{s_1}{s_0},\quad \sqrt{\frac{s_2}{s_0}},\quad \sqrt[3]{\frac{s_3}{s_0}},\ \dots$$

may become practically identical, and therefore unable to express more than one property
of the law of errors.

From a well known theorem of the theory of symmetrical functions, the equations

$$1 + a_1\omega + a_2\omega^2 + \cdots + a_n\omega^n = (1 - o_1\omega)(1 - o_2\omega)\cdots(1 - o_n\omega) = e^{\sum\log(1 - o_i\omega)},$$

which are identical with regard to every value of $\omega$, we learn that the sum of the powers
$s_r$ can be computed without ambiguity, if we know the coefficients $a_r$ of the equation,
whose roots are the $n$ observations; and vice versa, by differentiating the last equation
with regard to $\omega$, and equating the coefficients we get

$$\begin{aligned}
0 &= a_1 + s_1\\
0 &= 2a_2 + a_1s_1 + s_2\\
&\;\;\vdots\\
0 &= na_n + a_{n-1}s_1 + \cdots + a_1s_{n-1} + s_n
\end{aligned} \qquad (16)$$

from which the coefficients $a_r$ are unambiguously and very easily computed, when the $s_r$
are directly calculated.
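
The successive solution of these equations for the coefficients can be sketched as follows. The sketch assumes the equations in the form $0 = r\,a_r + a_{r-1}s_1 + \cdots + a_1 s_{r-1} + s_r$ with $a_0 = 1$, and compares the result with the direct expansion of the product $(1 - o_1\omega)\cdots(1 - o_n\omega)$; the function names and the sample observations are illustrative.

```python
from fractions import Fraction

def power_sums(observations, r_max):
    """s_0, s_1, ..., s_r_max of the observations, as exact fractions."""
    return [sum(Fraction(o) ** r for o in observations) for r in range(r_max + 1)]

def coefficients_from_power_sums(s):
    """Solve successively r*a_r + a_{r-1}*s_1 + ... + a_1*s_{r-1} + s_r = 0."""
    n = len(s) - 1
    a = [Fraction(1)]                                   # a_0 = 1
    for r in range(1, n + 1):
        acc = sum(a[r - k] * s[k] for k in range(1, r + 1))
        a.append(-acc / r)
    return a

def expand_product(observations):
    """Coefficients of (1 - o_1 w)(1 - o_2 w)...(1 - o_n w), lowest power first."""
    coeffs = [Fraction(1)]
    for o in observations:
        # Multiply the current polynomial by (1 - o * w).
        coeffs = [c - Fraction(o) * p for c, p in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

obs = [1, 2, 4]                     # hypothetical observed values
s = power_sums(obs, len(obs))
print(coefficients_from_power_sums(s))
```

Exact fractions are used so that the comparison of the two computations is not disturbed by rounding.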

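§ 22. But from the sums of powers we can easily compute also another serviceable
collection of symmetrical functions, which for brevity we shall call the half-invariants.
Starting from the sums of powers $s_r$, these can be defined as $\mu_1, \mu_2, \mu_3, \dots$ by the
equation

$$s_0\,e^{\frac{\mu_1}{1!}\tau + \frac{\mu_2}{2!}\tau^2 + \frac{\mu_3}{3!}\tau^3 + \cdots} = s_0 + \frac{s_1}{1!}\tau + \frac{s_2}{2!}\tau^2 + \frac{s_3}{3!}\tau^3 + \cdots, \qquad (17)$$

which we suppose identical with regard to $\tau$.
As $s_r = \sum o^r$, this can be written

$$s_0\,e^{\frac{\mu_1}{1!}\tau + \frac{\mu_2}{2!}\tau^2 + \frac{\mu_3}{3!}\tau^3 + \cdots} = e^{o_1\tau} + e^{o_2\tau} + \cdots + e^{o_n\tau}. \qquad (18)$$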


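By developing the first term of (17) as $\sum k_r\tau^r$ and equating the coefficients of each
power of $\tau$, we get each $\frac{s_r}{s_0}$ expressed as a function of $\mu_1, \dots, \mu_r$:

$$\begin{aligned}
s_1 &= s_0\,\mu_1\\
s_2 &= s_0\,(\mu_2 + \mu_1^2)\\
s_3 &= s_0\,(\mu_3 + 3\mu_2\mu_1 + \mu_1^3)\\
s_4 &= s_0\,(\mu_4 + 4\mu_3\mu_1 + 3\mu_2^2 + 6\mu_2\mu_1^2 + \mu_1^4)
\end{aligned} \qquad (19)$$

Taking the logarithms of (17) we get

$$\frac{\mu_1}{1!}\tau + \frac{\mu_2}{2!}\tau^2 + \frac{\mu_3}{3!}\tau^3 + \cdots = \log\left(1 + \frac{s_1}{s_0}\frac{\tau}{1!} + \frac{s_2}{s_0}\frac{\tau^2}{2!} + \cdots\right) \qquad (20)$$

and hence

$$\begin{aligned}
\mu_1 &= s_1 : s_0\\
\mu_2 &= (s_2s_0 - s_1^2) : s_0^2\\
\mu_3 &= (s_3s_0^2 - 3s_2s_1s_0 + 2s_1^3) : s_0^3\\
\mu_4 &= (s_4s_0^3 - 4s_3s_1s_0^2 - 3s_2^2s_0^2 + 12s_2s_1^2s_0 - 6s_1^4) : s_0^4
\end{aligned} \qquad (21)$$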

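The general law of the relation between the $\mu$ and $s$ is more easily understood
through the equations

$$\begin{aligned}
s_1 &= \mu_1s_0\\
s_2 &= \mu_1s_1 + \mu_2s_0\\
s_3 &= \mu_1s_2 + 2\mu_2s_1 + \mu_3s_0\\
s_4 &= \mu_1s_3 + 3\mu_2s_2 + 3\mu_3s_1 + \mu_4s_0
\end{aligned} \qquad (22)$$

where the numerical coefficients are those of the binomial theorem. These equations can
be demonstrated by differentiation of (17) with regard to $\tau$, the resulting equation

$$\left(\mu_1 + \frac{\mu_2}{1!}\tau + \frac{\mu_3}{2!}\tau^2 + \cdots\right)\left(s_0 + \frac{s_1}{1!}\tau + \frac{s_2}{2!}\tau^2 + \cdots\right) = s_1 + \frac{s_2}{1!}\tau + \frac{s_3}{2!}\tau^2 + \cdots \qquad (23)$$

being satisfied for all values of $\tau$ by (22).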
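
The equations with binomial coefficients lend themselves to a successive computation of the half-invariants from the sums of powers. A minimal sketch, assuming the general equation $s_r = \sum_{k=1}^{r}\binom{r-1}{k-1}\mu_k s_{r-k}$; the function names and the sample observations are illustrative.

```python
from fractions import Fraction
from math import comb

def power_sums(observations, r_max):
    """s_0, s_1, ..., s_r_max of the observations, as exact fractions."""
    return [sum(Fraction(o) ** r for o in observations) for r in range(r_max + 1)]

def half_invariants(s):
    """Return [mu_1, ..., mu_R] from the power sums s[0..R] by solving
    s_r = sum_{k=1..r} C(r-1, k-1) * mu_k * s_{r-k} for mu_r in turn."""
    mu = [None]                                    # index 0 is unused
    for r in range(1, len(s)):
        known = sum(comb(r - 1, k - 1) * mu[k] * s[r - k] for k in range(1, r))
        mu.append((Fraction(s[r]) - known) / s[0])
    return mu[1:]

# Hypothetical observations: the value 1 once and the value 0 twice.
print(half_invariants(power_sums([0, 0, 1], 4)))
```

Adding the same constant to every observation changes only the first half-invariant, which gives a convenient check of any such computation.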

These half-invariants possess several remarkable properties. From (18) we get

$$s_0\,e^{\frac{\mu_2}{2!}\tau^2 + \frac{\mu_3}{3!}\tau^3 + \cdots} = e^{(o_1-\mu_1)\tau} + e^{(o_2-\mu_1)\tau} + \cdots + e^{(o_n-\mu_1)\tau}, \qquad (24)$$

consequently any transformation $o' = o + c$, any change of the zero of all observations
$o_1, \dots, o_n$, affects only $\mu_1$ in the same manner, but leaves $\mu_2, \mu_3, \mu_4, \dots$ unaltered; any
change of the unit of all observations can be compensated by the reciprocal change of the
unit of $\tau$, and becomes therefore indifferent to $\mu_1\tau, \mu_2\tau^2, \mu_3\tau^3, \dots$

Not only the ratios

$$\frac{s_1}{s_0},\quad \frac{s_2}{s_0},\quad \frac{s_3}{s_0},\ \dots$$

but also the half-invariants have the property which is so important in a law of errors,
of remaining unchanged when the whole series of repetitions is repeated unchanged.

We have seen that the typical character of a law of errors reveals itself in the
elegant functional form

$$\varphi(x) = e^{-\frac12\left(\frac{x-m}{n}\right)^2}.$$

Now we shall see that it is fully as easy to recognize the typical laws of errors by means
of their half-invariants. Here the criterion is that $\mu_r = 0$ if $r \ge 3$, while $\mu_1 = m$ and
$\mu_2 = n^2$. This remarkable proposition has originally led me to prefer the half-invariants
to every other system of symmetrical functions; it is easily demonstrated by means of (5),
if we take $m$ for the zero of the observations.

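We begin by forming the sums of powers $s_r$ of that law of errors where the
frequency of an observed $x$ is proportional to $\varphi(x) = e^{-\frac12\left(\frac{x}{n}\right)^2}$; as this law is continuous
we get

$$s_r = \int_{-\infty}^{+\infty} x^r\varphi(x)\,dx.$$

For every differential coefficient we have

$$\int_{-\infty}^{+\infty} D^r\varphi(x)\,dx = D^{r-1}\varphi(\infty) - D^{r-1}\varphi(-\infty) = 0,$$

consequently we learn from (5) that $s_{2r+1} = 0$, but

$$\begin{aligned}
s_2 &= n^2s_0\\
s_4 &= 1\cdot 3\,n^4s_0\\
s_6 &= 1\cdot 3\cdot 5\,n^6s_0
\end{aligned}$$

(compare problem 3, § 18). Now the half-invariants can be found by (22) or by (17). If
we use (22) we remark that $s_{2r} = n^2(2r-1)s_{2r-2}$; then writing for (22)

$$\begin{aligned}
s_1 &= \mu_1s_0 = 0\\
s_2 - n^2s_0 &= \mu_1s_1 + (\mu_2 - n^2)s_0 = 0\\
s_3 &= \mu_1s_2 + 2\mu_2s_1 + \mu_3s_0 = 0\\
s_4 - 3n^2s_2 &= \mu_1s_3 + 3(\mu_2 - n^2)s_2 + 3\mu_3s_1 + \mu_4s_0 = 0\\
s_5 &= \mu_1s_4 + 4\mu_2s_3 + 6\mu_3s_2 + 4\mu_4s_1 + \mu_5s_0 = 0\\
s_6 - 5n^2s_4 &= \mu_1s_5 + 5(\mu_2 - n^2)s_4 + 10\mu_3s_3 + 10\mu_4s_2 + 5\mu_5s_1 + \mu_6s_0 = 0
\end{aligned}$$

we see that the solution is $\mu_2 = n^2$ and $\mu_1 = \mu_3 = \mu_4 = \cdots = 0$.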

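By (17) we get

$$s_0\,e^{\frac{\mu_1}{1!}\tau + \frac{\mu_2}{2!}\tau^2 + \cdots} = s_0\left(1 + \frac{n^2\tau^2}{2} + \frac{1}{1\cdot 2}\left(\frac{n^2\tau^2}{2}\right)^2 + \cdots\right) = s_0\,e^{\frac{n^2\tau^2}{2}}.$$

Equating the coefficients of $\tau^r$ we get here also $\mu_1 = 0 = m$, $\mu_2 = n^2$, $\mu_r = 0$
if $r \ge 3$.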

If we wish to demonstrate this important proposition without change of the zero,
and without the use of the equations (5) whose general demonstration is somewhat
difficult, we can commence by the lemma that, for each integral and positive value of $r$, and
also for $r = 0$, we have for the typical law of errors

$$s_{r+1} = m\,s_r + r\,n^2s_{r-1}.$$

The function $\psi(x) = n^2x^re^{-\frac12\left(\frac{x-m}{n}\right)^2}$ is equal to zero both for $x = \infty$ and for $x = -\infty$;
if we now between these limits integrate its differential equation

$$D\psi(x) = \left(r\,n^2x^{r-1} - (x - m)x^r\right)e^{-\frac12\left(\frac{x-m}{n}\right)^2},$$

we get

$$0 = -s_{r+1} + m\,s_r + r\,n^2s_{r-1},$$

where

$$s_r = \int_{-\infty}^{+\infty} x^re^{-\frac12\left(\frac{x-m}{n}\right)^2}\,dx.$$

If we now from (22) subtract, term by term, the equations

$$\begin{aligned}
s_1 &= m\,s_0\\
s_2 &= m\,s_1 + n^2s_0\\
s_3 &= m\,s_2 + 2n^2s_1\\
s_4 &= m\,s_3 + 3n^2s_2
\end{aligned}$$

it is obvious that $\mu_1 = m$, $\mu_2 = n^2$, $\mu_3 = \mu_4 = \cdots = 0$.
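
The proposition can be checked by building the sums of powers of a typical law from the lemma $s_{r+1} = m\,s_r + r\,n^2 s_{r-1}$ and then solving the equations with binomial coefficients for the half-invariants. This is only a sketch; the function names are illustrative, and the normalisation $s_0 = 1$ is an assumption of the example (any other $s_0$ gives the same half-invariants).

```python
from fractions import Fraction
from math import comb

def typical_power_sums(m, n, r_max, s0=1):
    """s_0 .. s_r_max of a typical law by s_{r+1} = m*s_r + r*n^2*s_{r-1}."""
    m, n2 = Fraction(m), Fraction(n) ** 2
    s = [Fraction(s0)]
    for r in range(r_max):
        prev = s[r - 1] if r >= 1 else 0       # no s_{-1} term when r = 0
        s.append(m * s[r] + r * n2 * prev)
    return s

def half_invariants(s):
    """Solve s_r = sum_{k=1..r} C(r-1, k-1) * mu_k * s_{r-k} for mu_1, mu_2, ..."""
    mu = [None]                                # index 0 is unused
    for r in range(1, len(s)):
        known = sum(comb(r - 1, k - 1) * mu[k] * s[r - k] for k in range(1, r))
        mu.append((s[r] - known) / s[0])
    return mu[1:]

print(half_invariants(typical_power_sums(3, 2, 8)))
```

For $m = 3$, $n = 2$ the computation returns $\mu_1 = 3$, $\mu_2 = 4$ and zero for all higher half-invariants up to the order computed.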

By computation of $\mu_1$ and $\mu_2$ we find consequently, in the simplest way, the
constants of a typical law of errors.

If the law of errors deviates only a little from the typical form, $\mu_3$, $\mu_4$, etc., will
also, all of them, be relatively small numbers; and each of them may be either positive
or negative.

On the whole, a law of errors can be determined without ambiguity by the values
$\mu_1, \mu_2, \dots, \mu_n$, $n$ being the number of repetitions. From any such $\mu$'s we can compute
the sums of the powers $s$ unambiguously, and from these again the coefficients of the
equation whose roots are the observed values.

But for real laws of errors it is a necessary condition that no imaginary root
can be admitted. If an infinite number of repetitions is considered, the equation ceases
to be algebraic, and then the convergency of the series necessary for its solution is a
further condition.

§ 23. The mean value $\mu_1 = \frac{s_1}{s_0}$ is always greater than the
least, less than the greatest of the observed values $o_1, o_2, \dots, o_n$; under typical
circumstances we shall find almost the same number of greater and less values of the observations.
The majority of them lie rather near to $\mu_1$; only few very distant from it. The mean
value is the simplest representative of what is common in a series of values found by
repetition; its application as such is most likely exceedingly old, and marks in the history
of science the first trace of a theory of observations.

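The mean deviation, whose square is $= \mu_2$, measures the magnitude of the
deviations, the uncertainty of the repeated actual observations. The square of the mean deviation
is the mean of the squares of the deviations of the several observations from their mean
value. By addition of

$$\begin{aligned}
(o_1 - \mu_1)^2 &= o_1^2 - 2o_1\mu_1 + \mu_1^2\\
(o_2 - \mu_1)^2 &= o_2^2 - 2o_2\mu_1 + \mu_1^2\\
&\;\;\vdots\\
(o_n - \mu_1)^2 &= o_n^2 - 2o_n\mu_1 + \mu_1^2
\end{aligned}$$

we get

$$\sum(o - \mu_1)^2 = s_2 - 2s_1\mu_1 + s_0\mu_1^2 = s_2 - \frac{s_1^2}{s_0}$$

and

$$\mu_2 = \frac{\sum(o - \mu_1)^2}{s_0}. \qquad (25)$$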


The computation of $\mu_2$ by this formula will often be easier than by the equation
(21), because $s_2$ in the latter must frequently be computed with more figures. There is
however a middle course, which is often to be preferred to either of these methods of
computation. As a change in the zero of observations involves the same increase of every
$o$ and of $\mu_1$, it will, according to (24), have no influence at all on $\mu_2$. We select therefore
as zero a convenient, round number, $c$, very near $\mu_1$, and by reference to this zero the
observed values are transformed to

$$o_1' = o_1 - c,\quad o_2' = o_2 - c,\quad \dots,\quad o_n' = o_n - c.$$

When $s_1'$ and $s_2'$ indicate the sums of the transformed observations and of their squares,
we have

$$\mu_1 = c + \frac{s_1'}{s_0} = c + \frac{\sum(o - c)}{s_0} \quad\text{and}\quad \mu_2 = \frac{s_2'}{s_0} - \left(\frac{s_1'}{s_0}\right)^2. \qquad (26)$$


We have still to mention a theorem concerning the mean deviation, which, though
not useful for computation, is useful for the comprehension and further development of the
idea: The square of the mean deviation is equal to the sum of squares of the difference
between each observed value and each of the others, divided by twice the square of the
number. The said squares are:

$$\begin{matrix}
(o_1 - o_1)^2, & (o_2 - o_1)^2, & \dots, & (o_n - o_1)^2;\\
(o_1 - o_2)^2, & (o_2 - o_2)^2, & \dots, & (o_n - o_2)^2;\\
\vdots & & & \vdots\\
(o_1 - o_n)^2, & (o_2 - o_n)^2, & \dots, & (o_n - o_n)^2;
\end{matrix}$$

developing each of these by the formula $(o_r - o_s)^2 = o_r^2 - 2o_ro_s + o_s^2$, and first adding each
column separately, we find the sums

$$s_0o_1^2 - 2s_1o_1 + s_2,\quad s_0o_2^2 - 2s_1o_2 + s_2,\quad \dots,\quad s_0o_n^2 - 2s_1o_n + s_2,$$

and the sum of these

$$s_0s_2 - 2s_1s_1 + s_2s_0 = 2(s_2s_0 - s_1^2).$$

Consequently,

$$\sum\sum(o_r - o_s)^2 = 2s_0^2\mu_2. \qquad (27)$$

The mean deviation is greater than the least, less than the greatest of the deviations of
the values of repetitions from the mean number, and less than $\sqrt{\tfrac12}$ of the greatest deviation
between two observed values.
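
The theorem on the differences between each observed value and each of the others can be verified on any small series. The sketch below assumes the theorem in the form $\sum\sum(o_r - o_s)^2 = 2 s_0^2\mu_2$, the double sum being taken over all ordered pairs; the function names and the sample values are illustrative.

```python
from fractions import Fraction

def mu2_direct(obs):
    """Mean of the squared deviations from the mean value."""
    n = len(obs)
    mean = Fraction(sum(obs), n)
    return sum((o - mean) ** 2 for o in obs) / n

def mu2_pairwise(obs):
    """Sum of (o_r - o_s)^2 over all ordered pairs, divided by 2 * n^2."""
    n = len(obs)
    total = sum((a - b) ** 2 for a in obs for b in obs)
    return Fraction(total, 2 * n * n)

obs = [1, 4, 4, 7, 10]              # hypothetical observed values
print(mu2_direct(obs), mu2_pairwise(obs))
```

Since no single squared difference can exceed the square of the greatest deviation between two observed values, the pairwise form also makes the stated upper limit of the mean deviation evident.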

As to the higher half-invariants it may here be enough to state that they indicate
various sorts of deviations from the typical form. Skew curves of errors are indicated by
the $\mu_{2r+1}$ being different from zero, peaked or flattened (divided) forms respectively by
positive or negative values of $\mu_{4r}$, and inversely by $\mu_{4r+2}$.

For these higher half-invariants we shall propose no special names. But we have
already introduced double names "relative frequency" and "probability" in order to
accentuate the distinction between the laws of actual errors and those of presumptive errors,
and the same we ought to do for the half-invariants. In what follows we shall indicate
the half-invariants in laws of presumptive errors by the signs $\lambda_r$ instead of $\mu_r$ which will
be reserved for laws of actual errors, particularly when we shall treat of the transition
from laws of actual errors to those of presumptive ones. For special reasons, to be
explained later on, the name mean value can be used without confusion both for $\mu_1$ and $\lambda_1$,
for actual as well as for presumptive means; but instead of "mean deviation" we say
"mean error", when we speak of laws of presumptive errors. Thus

$$\lambda_2 = \lim_{n=\infty}(\mu_2)$$

is called the square of the mean error.

In speculations upon ideal laws of errors, when the laws are supposed to be
continuous or to relate to infinite numbers of observations, this distinction is of course
insignificant.

Examples:

1. Professor Jul. Thomsen found for the constant of a calorimeter, in experiments
with pure water, in seven repetitions, the values

2649, 2647, 2645, 2653, 2653, 2646, 2649.

If we here take 2650 as zero, we read the observations as

-1, -3, -5, +3, +3, -4, -1

so that $s_0 = 7$, $s_1' = -8$, and $s_2' = 70$; consequently

$$\mu_1 = 2650 - \frac{8}{7} = 2648.86, \qquad \mu_2 = \frac{70}{7} - \left(\frac{8}{7}\right)^2 = 8.69.$$

The mean deviation is consequently $\sqrt{8.69} = 2.95$.
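
The middle course with a convenient round zero can be sketched on the seven calorimeter values $2649, 2647, 2645, 2653, 2653, 2646, 2649$. The function name is illustrative; exact fractions are used so that the independence of $\mu_2$ from the choice of zero can be tested without rounding.

```python
from fractions import Fraction

def mean_and_mu2(observations, c):
    """mu_1 and mu_2 computed from the sums of (o - c) and (o - c)^2,
    where c is a round number chosen near the mean value."""
    shifted = [o - c for o in observations]
    s0 = len(shifted)
    s1 = sum(shifted)
    s2 = sum(d * d for d in shifted)
    mu1 = c + Fraction(s1, s0)
    mu2 = Fraction(s2, s0) - Fraction(s1, s0) ** 2
    return mu1, mu2

thomsen = [2649, 2647, 2645, 2653, 2653, 2646, 2649]
mu1, mu2 = mean_and_mu2(thomsen, 2650)
print(float(mu1), float(mu2), float(mu2) ** 0.5)
```

With the zero 2650 the sums involve only one- and two-figure numbers, whereas with the zero 0 the squares run to seven figures, although the result is the same.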

2. In an alternative experiment the result is either "yes", which counts 1, or
"no", which counts 0. Out of $m + n$ repetitions the $m$ have given "yes", the $n$ "no".
What then is the expression for the law of errors in half-invariants?

Answer: $\mu_1 = \frac{m}{m+n}$, $\mu_2 = \frac{mn}{(m+n)^2}$, $\mu_3 = \frac{mn(n-m)}{(m+n)^3}$, $\mu_4 = \frac{mn(m^2 - 4mn + n^2)}{(m+n)^4}$.


3. Determine the law of errors, in half-invariants, of a voting in which $a$ voters
have voted for a motion ($+1$), $c$ against ($-1$), while $b$ have not voted (0), and examine
what values for $a$, $b$, and $c$ give the nearest approximation to the typical form.

$$\mu_1 = \frac{a - c}{a + b + c},\qquad \mu_2 = \frac{ab + 4ca + bc}{(a + b + c)^2},\qquad \mu_3 = \frac{(c - a)(ab + 8ca + bc - b^2)}{(a + b + c)^3},$$

$$\mu_4 = -\frac{\left((a + c)(a + b + c) - 4(a - c)^2\right)(a + b + c)(2a - b + 2c) + 6(a - c)^4}{(a + b + c)^4}.$$

Disregarding the case when the vote is unanimous, the double condition $\mu_3 = \mu_4 = 0$
is only satisfied when one sixth of the votes is for, another sixth against, while two
thirds do not give their votes. If $\mu_3$ is to be $= 0$, without $a$ being $= c$,
$b^2 - ab - bc - 8ac$ must be $= 0$. But then $\mu_4 = -2\mu_1^2\mu_2$, which does not disappear unless
two of the numbers $a$, $b$, and $c$, and consequently $\mu_2$, are $= 0$.

4. Six repetitions give the quite symmetrical and almost typical law of errors,
$\mu_1 = 0$, $\mu_2 = \frac13$, $\mu_3 = \mu_4 = \mu_5 = 0$, but $\mu_6 = -\frac29$. What are the observed values?

Answer: $-1, 0, 0, 0, 0, +1$.
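
The answers to the examples on the alternative experiment, the voting, and the six repetitions can be checked by the closed expressions of the first four half-invariants in terms of the sums of powers. The sketch below uses the standard expressions ($\mu_2 = (s_2s_0 - s_1^2) : s_0^2$, and so on); the function names are illustrative.

```python
from fractions import Fraction

def half_invariants_closed(s0, s1, s2, s3, s4):
    """mu_1 .. mu_4 from the power sums by the closed expressions."""
    s0, s1, s2, s3, s4 = map(Fraction, (s0, s1, s2, s3, s4))
    mu1 = s1 / s0
    mu2 = (s2 * s0 - s1**2) / s0**2
    mu3 = (s3 * s0**2 - 3 * s2 * s1 * s0 + 2 * s1**3) / s0**3
    mu4 = (s4 * s0**3 - 4 * s3 * s1 * s0**2 - 3 * s2**2 * s0**2
           + 12 * s2 * s1**2 * s0 - 6 * s1**4) / s0**4
    return mu1, mu2, mu3, mu4

def voting(a, b, c):
    """a votes counted +1, b counted 0, c counted -1."""
    # Odd power sums equal a - c, even ones (from the second on) equal a + c.
    return half_invariants_closed(a + b + c, a - c, a + c, a - c, a + c)

print(voting(1, 4, 1))     # one sixth for, one sixth against, two thirds abstaining
```

The voting with one sixth for, one sixth against and two thirds abstaining has the same sums of powers per repetition as the six values $-1, 0, 0, 0, 0, +1$, so both give the same half-invariants.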


VII. RELATIONS BETWEEN FUNCTIONAL LAWS OF ERRORS
AND HALF-INVARIANTS.

§ 24. The multiplicity of forms of the laws of errors makes it impossible to write
a Theory of Observations in a short manner. For though those forms are of very different
value, none of them can be considered as absolutely superior to the others. The functional
form which has been universally employed hitherto, and by the most prominent writers, has
in my opinion proved insufficient. I shall here endeavour to replace it by the half-invariants.
But even if I should succeed in this endeavour, I am sure that not only the functional
laws of errors, but even the curves of errors and the tables of frequency are too important
and natural to be put completely aside without detriment.

Moreover, in proposing a new plan for this theory, I have felt it my duty to explain
as precisely and completely as possible its relation to the old and commonly known methods.
I therefore consider it a matter of great importance that even the half-invariants, in their
very definition, present a natural transition to the frequencies and to the functional law
of errors.
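If in the equation (18)

$$s_0\,e^{\frac{\mu_1}{1!}\tau+\frac{\mu_2}{2!}\tau^2+\frac{\mu_3}{3!}\tau^3+\cdots} = e^{o_1\tau}+e^{o_2\tau}+\cdots+e^{o_n\tau}$$

some of the $o_i$'s are exactly repeated, it is of course understood that the term $e^{o_i\tau}$ must
be counted not once but as often as $o_i$ is repeated. Consequently, this definition of the
half-invariants may, without any change of sense, be written

$$\Sigma\varphi(o_i)\cdot e^{\frac{\mu_1}{1!}\tau+\frac{\mu_2}{2!}\tau^2+\frac{\mu_3}{3!}\tau^3+\cdots} = \Sigma\varphi(o_i)\,e^{o_i\tau} \qquad (28)$$

where the frequencies $\varphi(o_i)$ are given in the form of the functional law of errors. For
continuous laws of errors the definition must be written

$$e^{\frac{\lambda_1}{1!}\tau+\frac{\lambda_2}{2!}\tau^2+\frac{\lambda_3}{3!}\tau^3+\cdots}\int_{-\infty}^{+\infty}\varphi(o)\,do = \int_{-\infty}^{+\infty}\varphi(o)\,e^{o\tau}\,do. \qquad (29)$$

Thus, if we know the functional law of errors and if we can perform the integrations, the
half-invariants may be found. If, inversely, we know the $\lambda_r$, then it may be possible also
to determine the functional law of errors $\varphi(o)$.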

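Example 1. Let $\varphi(o)$ be a sum of typical functional laws of errors,

$$\varphi(o) = \Sigma h_i\,e^{-\frac12\left(\frac{o-m_i}{n_i}\right)^2},$$

then $\int_{-\infty}^{+\infty}\varphi(o)\,do = \sqrt{2\pi}\,\Sigma h_in_i$ and

$$\int_{-\infty}^{+\infty}\varphi(o)\,e^{o\tau}\,do = \Sigma h_i\int_{-\infty}^{+\infty}e^{o\tau-\frac12\left(\frac{o-m_i}{n_i}\right)^2}do,$$

and consequently

$$e^{\frac{\lambda_1}{1!}\tau+\frac{\lambda_2}{2!}\tau^2+\cdots}\,\Sigma h_in_i = \Sigma h_in_i\,e^{m_i\tau+\frac12n_i^2\tau^2}.$$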





By aid of the formulae (19), which express the sums of powers as functions of the $\lambda$ (or $\mu$),
it is not difficult to compute the principal half-invariants. The inverse problem, to compute
the $m_i$, $n_i$ and $h_i$ by means of given half-invariants, is very difficult, as it results in
equations of a high degree, even if only a sum of two typical functional laws of errors is
in question.

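Example 2. What are the half-invariants of a pure binomial law of errors? The
observation $r$ being repeated $\beta_n(r)$ times, we write

$$s_0\,e^{\frac{\mu_1}{1!}\tau+\frac{\mu_2}{2!}\tau^2+\cdots} = \beta_n(0)+\beta_n(1)e^{\tau}+\cdots+\beta_n(n)e^{n\tau} = (1+e^{\tau})^n,$$

consequently, as $s_0 = 2^n$,

$$\frac{\mu_1}{1!}\tau+\frac{\mu_2}{2!}\tau^2+\cdots = \frac n2\tau+n\log\cos\frac{\tau\sqrt{-1}}{2}.$$

Here the logarithm on the right hand side of the equation can be developed by the aid of
Bernoullian numbers into a series containing only the even powers of $\tau$; consequently

$$\mu_1 = \frac n2,\qquad \mu_{2r+1} = 0\quad(r>0),$$

further

$$\mu_2 = \frac n4,\quad \mu_4 = -\frac n8,\quad \mu_6 = \frac n4,\quad \mu_8 = -\frac{17}{16}n,\quad \mu_{10} = \frac{31}{4}n,\ \ldots$$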
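The values of this example lend themselves to an exact numerical check. The sketch below is an illustration only: the function names are hypothetical, and the frequencies $\beta_n(r)$ are taken to be the binomial coefficients, as the expansion of $(1+e^{\tau})^n$ implies.

```python
from fractions import Fraction
from math import comb


def half_invariants(values, order, weights):
    """Half-invariants mu_1..mu_order; value i is counted weights[i] times."""
    s0 = sum(weights)
    m = [sum(Fraction(w) * Fraction(v) ** k for v, w in zip(values, weights)) / s0
         for k in range(order + 1)]
    mu = [Fraction(0)] * (order + 1)
    for r in range(1, order + 1):
        mu[r] = m[r] - sum(comb(r - 1, j - 1) * mu[j] * m[r - j] for j in range(1, r))
    return mu[1:]


def pure_binomial_half_invariants(n, order):
    """Observation r (0..n) repeated C(n, r) times."""
    values = list(range(n + 1))
    weights = [comb(n, r) for r in values]
    return half_invariants(values, order, weights)


for r, value in enumerate(pure_binomial_half_invariants(5, 8), start=1):
    print(r, value)
```

For `n = 5` the even half-invariants come out as $n/4$, $-n/8$, $n/4$, $-17n/16$, and all odd ones beyond the first vanish.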


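Example 3. What are the half-invariants of a complete binomial law of errors
(the complete terms of $(p+q)^n$)? Here

$$s_0\,e^{\frac{\mu_1}{1!}\tau+\frac{\mu_2}{2!}\tau^2+\cdots} = (p+qe^{\tau})^n.$$

From this we obtain by differentiation with regard to $\tau$

$$\mu_1+\frac{\mu_2}{1!}\tau+\frac{\mu_3}{2!}\tau^2+\cdots = \frac{nqe^{\tau}}{p+qe^{\tau}},$$

by further differentiation

$$\mu_2+\frac{\mu_3}{1!}\tau+\cdots = \frac{npqe^{\tau}}{(p+qe^{\tau})^2},\ \ldots$$

putting $\tau = 0$ we get

$$\begin{aligned}
\mu_1 &= \frac{nq}{p+q}\\
\mu_2 &= \frac{npq}{(p+q)^2}\\
\mu_3 &= \frac{npq(p-q)}{(p+q)^3}\\
\mu_4 &= \frac{npq}{(p+q)^2}\left(\frac{(p-q)^2}{(p+q)^2}-\frac{2pq}{(p+q)^2}\right)\\
\mu_5 &= \frac{npq(p-q)}{(p+q)^3}\left(\frac{(p-q)^2}{(p+q)^2}-\frac{8pq}{(p+q)^2}\right)
\end{aligned}$$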


Example 4. A law of presumptive errors is given by its half-invariants forming
a geometrical progression, $\lambda_r = ba^r$. Determine the several observations and their frequencies.
Here the left hand side of the equation (18) is

$$s_0\,e^{b\left(\frac{a\tau}{1!}+\frac{a^2\tau^2}{2!}+\cdots\right)} = s_0\,e^{b(e^{a\tau}-1)} = s_0\,e^{-b}\left(1+\frac{b}{1!}e^{a\tau}+\frac{b^2}{2!}e^{2a\tau}+\cdots\right);$$

but this is also the form of the right side of (28). Thus the observed values are
$0, a, 2a, 3a, \ldots$ and the relative frequency of $ra$ is $e^{-b}\frac{b^r}{r!} = \varphi(r)$. This law of errors
is nearly related to the binomial law, which can be considered as a product of two factors
of this kind,

$$\frac{p^r}{r!}\cdot\frac{q^{\,n-r}}{(n-r)!}.$$

It is perhaps superior to the binomial law as a representative of some skew laws
of errors.

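Example 5. A law of errors has the peculiarity that all half-invariants of odd
order are $= 0$, while all even half-invariants are equal to each other, $\lambda_{2r} = 2a$. Show
that all the observations must be integral numbers, and that for the relative frequencies

$$\varphi(0) = e^{-2a}\left(1+\left(\frac{a}{1!}\right)^2+\left(\frac{a^2}{2!}\right)^2+\cdots\right)$$

$$\varphi(\pm r) = e^{-2a}\left(\frac{a^r}{r!}+\frac{a^{r+2}}{1!\,(r+1)!}+\frac{a^{r+4}}{2!\,(r+2)!}+\cdots\right)$$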

Example 6. Determine the half-invariants of the law of presumptive errors for the
irrational values in the table of a function, in whose computation fractions under $\frac12$ have
been rejected and those over $\frac12$ replaced by 1:

$$\lambda_{2r+1} = 0,\quad \lambda_2 = \frac{1}{12},\quad \lambda_4 = -\frac{1}{120},\quad \lambda_6 = \frac{1}{252},\ \ldots$$


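§ 25. As a most general functional form of a continuous law of errors we have
proposed (6)

$$\Phi(x) = k_0\varphi(x)-\frac{k_1}{1!}D\varphi(x)+\frac{k_2}{2!}D^2\varphi(x)-\frac{k_3}{3!}D^3\varphi(x)+\cdots,$$

where $\varphi(x) = e^{-\frac12\left(\frac{x-m}{n}\right)^2}$.

Now it is a very remarkable thing that we can express the half-invariants without
any ambiguity as functions of the coefficients $k_r$, and vice versa.

By (29) we get

$$s_0\,e^{\frac{\lambda_1}{1!}\tau+\frac{\lambda_2}{2!}\tau^2+\cdots} = \int_{-\infty}^{+\infty}e^{o\tau}\Phi(o)\,do = \Sigma(-1)^r\frac{k_r}{r!}\int_{-\infty}^{+\infty}e^{o\tau}D^r\varphi(o)\,do,$$

where $s_0 = \int_{-\infty}^{+\infty}\Phi(o)\,do$. By means of the lemma

$$\int e^{o\tau}D^r\varphi(o)\,do = e^{o\tau}\left\{D^{r-1}\varphi(o)-\tau D^{r-2}\varphi(o)+\cdots+(-\tau)^{r-1}\varphi(o)\right\}+(-\tau)^r\int e^{o\tau}\varphi(o)\,do,$$

which is easily demonstrated for any $\varphi(o)$ by differentiating with regard to $o$ only, we
have in this particular case, where $\varphi(o)$ and every $D^r\varphi(o)$ is $= 0$, if $o = \pm\infty$,

$$\int_{-\infty}^{+\infty}e^{o\tau}D^r\varphi(o)\,do = (-\tau)^r\int_{-\infty}^{+\infty}e^{o\tau}\varphi(o)\,do = (-\tau)^r\,n\sqrt{2\pi}\,e^{m\tau+\frac12n^2\tau^2}.$$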


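Consequently, the relation between the half-invariants on one side and the coefficients $k$
of the general functional law of errors on the other, is

$$k_0+\frac{k_1}{1!}\tau+\frac{k_2}{2!}\tau^2+\frac{k_3}{3!}\tau^3+\cdots = k_0\,e^{\frac{\lambda_1-m}{1!}\tau+\frac{\lambda_2-n^2}{2!}\tau^2+\frac{\lambda_3}{3!}\tau^3+\frac{\lambda_4}{4!}\tau^4+\cdots} \qquad (30)$$

If we write here $\lambda_1-m = \lambda'_1$ and $\lambda_2-n^2 = \lambda'_2$, the computation of one set of
constants by the other can, according to (17), be made by the formulae (19) and (21). We
substitute only in these the $k$ for the $s$, and $\lambda'$ for $\lambda$ or $\mu$.

It will be seen that the constants $m$ and $n$, and the special typical law of errors
to which they belong, are generally superfluous. This superfluity in our transformation
may be useful in special cases for reasons of convergency, but in general it must be
considered a source of vagueness, and the constants must be fixed arbitrarily.

It is easiest and most natural to put

$$m = \lambda_1\quad\text{and}\quad n^2 = \lambda_2.$$

In this case we get $k_1 = 0$, $k_2 = 0$, $k_3 = k_0\lambda_3$, and further

$$k_4 = k_0\lambda_4,\quad k_5 = k_0\lambda_5,\quad k_6 = k_0(\lambda_6+10\lambda_3^2),\ \ldots$$

The law of the coefficients is explained by writing the right side of equation (30)

$$k_0\,e^{\frac{\lambda_3}{3!}\tau^3+\frac{\lambda_4}{4!}\tau^4+\frac{\lambda_5}{5!}\tau^5+\cdots}.$$

Expressed by half-invariants in this manner the explicit form of equation (6) is

$$\begin{aligned}
\Phi(x) = k_0\,e^{-\frac{(x-\lambda_1)^2}{2\lambda_2}}\Bigl\{1
&+\frac{\lambda_3}{3!\,\lambda_2^3}\bigl((x-\lambda_1)^3-3\lambda_2(x-\lambda_1)\bigr)\\
&+\frac{\lambda_4}{4!\,\lambda_2^4}\bigl((x-\lambda_1)^4-6\lambda_2(x-\lambda_1)^2+3\lambda_2^2\bigr)\\
&+\frac{\lambda_5}{5!\,\lambda_2^5}\bigl((x-\lambda_1)^5-10\lambda_2(x-\lambda_1)^3+15\lambda_2^2(x-\lambda_1)\bigr)+\cdots\Bigr\}
\end{aligned} \qquad (31)$$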


VIII. LAWS OF ERRORS OF FUNCTIONS OF OBSERVATIONS.

§ 26. There is nothing inconsistent with our definitions in speaking of laws of errors
relating to any group of quantities which, though not obtained by repeated observations,
have the like property, namely, that repeated estimations of a single thing give rise, owing
to errors of one kind or other, to multiple and slightly differing results which are prima
facie equally valid. The various forms of laws of actual errors are indeed only summary
expressions for such multiplicity; and the transition to the law of presumptive errors
requires, besides this, only that the multiplicity is caused by fixed but unknown
circumstances, and that the values must be mutually independent in that sense that none of the
circumstances have connected some repetitions to others in a manner which cannot be
common to all. Compare § 24, Example 6.

It is, consequently, not difficult to define the law of errors for a function of one
single observation. Provided only that the function is univocal, we can from each of the
observed values $o_1, o_2, \ldots o_n$ determine the corresponding value of the function, and

$$f(o_1),\ f(o_2),\ \ldots\ f(o_n)$$

will then be the series of repetitions in the law of errors of the function, and can be
treated quite like observations.

With respect, however, to those forms of laws of errors which make use of the
idea of frequency (probability) we must make one little reservation. Even though $o_i$ and
$o_k$ are different, we can have $f(o_i) = f(o_k)$, and in this case the frequencies must evidently
be added together. Here, however, we need only just mention this, and remark that the
laws of errors when expressed by half-invariants or other symmetrical functions are not
influenced by it.

Otherwise the frequency is the same for $f(o_i)$ as for $o_i$, and therefore also the
probability. The ordinates of the curves of errors are not changed by observations with
discontinuous values; but the abscissa $o_i$ is replaced by $f(o_i)$, and likewise the argument
in the functional law of errors. In continuous functions, on the other hand, it is the
areas between corresponding ordinates which must remain unchanged.




In the form of symmetrical functions the law of errors of functions of observations
may be computed, and not only when we know all the several observed values, and can
therefore compute, for each of them, the corresponding value of the function, and at last the
symmetrical functions of the latter. In many and important cases it is sufficient if we
know the symmetrical functions of the observations, as we can compute the symmetrical
functions of the functions directly from these. For instance, if $f(o) = o^2$, then the
sums of the powers $s'_m$ of the squares are also sums of the powers $s_m$ of the observations,
if only constantly $s'_0 = s_0$, $s'_1 = s_2$, $s'_2 = s_4$, etc.

§ 27. The principal thing is here a proposition as to laws of errors of the linear
functions by half-invariants.

It is almost self-evident that if $O = ao+b$

$$\begin{aligned}
\mu'_1 &= a\mu_1+b\\
\mu'_2 &= a^2\mu_2\\
\mu'_3 &= a^3\mu_3\\
&\ \ \text{etc.}\\
\mu'_r &= a^r\mu_r\quad(r>1)
\end{aligned} \qquad (32)$$

For the linear functions can always be considered as produced by the change of
both zero and unity of the observations (Compare (24)).

However special the linear function $ao+b$ may be, we always in practice manage
to get on with the formula (32). That we can succeed in this is owing to a happy
circumstance, the very same as, in numerical solutions of the problems of exact mathematics,
brings it about that we are but rarely, in the neighbourhood of equal roots, compelled to
employ the formula for the solution of other equations than those of the first degree.
Here we are favoured by the fact that we may suppose the errors in good observations
to be small, so small, to speak more exactly, that we may generally in repetitions
for each series of observations $o_1, \ldots o_n$ assign a number $c$, so near them all that
the squares and products and higher powers of the differences

$$o_1-c,\ o_2-c,\ \ldots\ o_n-c$$

without any perceptible error may be left out of consideration in computing the function:
i. e., these differences are treated like differentials. The differential calculus gives a definite
method, in such circumstances, for transforming any function $f(o)$ into a linear one

$$f(o) = f(c)+f'(c)\cdot(o-c).$$

The law of errors then becomes

$$\mu_1(f(o)) = f(c)+f'(c)\,(\mu_1(o)-c),\qquad \mu_r(f(o)) = (f'(c))^r\mu_r(o)\quad(r>1).$$

But also by quite elementary means and easy artifices we may often transform
functions into others of linear form. If for instance $f(o) = \frac1o$, then we write

$$\frac1o = \frac{1}{c+(o-c)} = \frac{c-(o-c)}{c^2-(o-c)^2} = \frac2c-\frac{o}{c^2},$$

and the law of errors is then

$$\mu_1\!\left(\frac1o\right) = \frac2c-\frac{\mu_1(o)}{c^2},\qquad \mu_2\!\left(\frac1o\right) = \frac{\mu_2(o)}{c^4},\ \ldots$$
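How good the linear substitute for $\frac1o$ is may be judged on a small invented series. The following sketch is merely illustrative: the four observations, the choice $c = 100$, and the function names are assumptions, and the approximate values are those which (32) gives for the linear form $\frac2c-\frac{o}{c^2}$.

```python
from fractions import Fraction
from math import comb


def half_invariants(values, order):
    """Half-invariants mu_1..mu_order of equally counted values."""
    n = len(values)
    m = [sum(Fraction(v) ** k for v in values) / n for k in range(order + 1)]
    mu = [Fraction(0)] * (order + 1)
    for r in range(1, order + 1):
        mu[r] = m[r] - sum(comb(r - 1, j - 1) * mu[j] * m[r - j] for j in range(1, r))
    return mu[1:]


def reciprocal_check(observations, c):
    """Exact mu_1, mu_2 of 1/o next to the values from the linear form 2/c - o/c^2."""
    exact = half_invariants([Fraction(1, v) for v in observations], 2)
    mu = half_invariants(observations, 2)
    approx = [Fraction(2, c) - mu[0] / c ** 2, mu[1] / c ** 4]
    return exact, approx


exact, approx = reciprocal_check([99, 100, 101, 100], 100)
print([float(x) for x in exact])
print([float(x) for x in approx])
```

The two pairs agree closely because the differences $o_i - c$ are small compared with $c$; with larger errors the neglected squares would become perceptible.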



§ 28. With respect to functions of two or more observed quantities we may also,
in case of repetitions, speak of laws of errors, only we must define more closely what we
are to understand by repetitions. For then another consideration comes in, which was out
of the question in the simpler case. It is still necessary for the idea of the law of errors
of $f(o, o')$ that we should have, for each of the observed quantities $o$ and $o'$, a series of
statements which severally may be looked upon as repetitions:

$$o_1,\ o_2,\ \ldots\ o_m$$

$$o'_1,\ o'_2,\ \ldots\ o'_n.$$

But here this is not sufficient. Now it makes a difference if, among the special
circumstances by $o$ and $o'$, there are or are not such as are common to observations of the
different series. We want a technical expression for this. Here it is not appropriate only
to speak of observations which are, respectively, dependent on one another or independent;
we are led to mistake the partial dependence of observations for the functional dependence
of exact quantities. I shall propose to designate these particular interdependences of
repetitions of different observations by the word "bond", which presumably cannot cause
any misunderstanding.

Among the repetitions of a single observation, no other bonds must be found than
such as equally bind all the repetitions together, and consequently belong to the peculiarities
of the method. But while, for instance, several pieces cast in the same mould may be
fair repetitions of one another, and likewise one dimension measured once on each piece,
two or more dimensions measured on the same piece must generally be supposed to be
bound together. And thus there may easily exist bonds which, by community in a
circumstance, as here the particularities in the several castings, bind some or all the
repetitions of a series each to its repetition of another observation; and if observations thus
connected are to enter into the same calculation, we must generally take these bonds into
account. This, as a rule, can only be done by proposing a theory or hypothesis as to the
mathematical dependence between the observed objects and their common circumstance,
and whether the number which expresses this is known from observation or quite unknown,
the right treatment falls under those methods of adjustment which will be mentioned
later on.

It is then in a few special cases only that we can determine laws of errors for
functions of two or more observed quantities, in ways analogous to what holds good of a
single observation and its functions.

If the observations $o, o', o'', \ldots$, which are to enter into the calculation of
$f(o, o', o'', \ldots)$, are repeated in such a way that, in general, $o_i, o'_i, o''_i, \ldots$ of the $i$-th
repetition are connected by a common circumstance, the same for each $i$, but otherwise
without any other bonds, we can for each $i$ compute a value of the function
$y_i = f(o_i, o'_i, o''_i, \ldots)$, and laws of errors can be determined for this, in just the same way as
for $o$ separately. To do so we need no knowledge at all of the special nature of the bonds.

§ 29. If, on the contrary, there is no bond at all between the repetitions of the
observations $o, o', o'', \ldots$ (and this is the principal case to which we must try to reduce
the others), then we must, in order to represent all the equally valid values of
$y = f(o, o', o'', \ldots)$, herein combine every observed value for $o$ with every one for $o'$, for $o''$,
etc., and all such values of $y$ must be treated analogously to the simple repetitions of one
single observed quantity. But while it may here easily become too great a task to
compute $y$ for each of the numerous combinations, we shall in this case be able to compute
$y$'s law of errors by means of the laws of errors for $o, o', o'', \ldots$

Concerning this a number of propositions might be laid down; but one of them
is of special importance and will be almost sufficient for us in what follows, viz., that
which teaches us to determine the law of errors for the sum $O = o+o'$ of the observed quantities
$o$ and $o'$.

If the law of errors is given in the form of relative frequencies or probabilities,
$\varphi(o)$ for $o$ and $\psi(o')$ for $o'$, then it is obvious that the product $\varphi(o)\psi(o')$ must be the
frequency of the special sum $o+o'$.

In the calculus of probabilities we shall consider this form more closely, and there
some cases of bound observations will find their solution; here we shall confine ourselves
to the treatment of the said case with half-invariants.

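If $o$ occurs with the observed values

$$o_1,\ o_2,\ \ldots\ o_m$$

and $o'$ with

$$o'_1,\ o'_2,\ \ldots\ o'_n,$$

then by the $mn$ repetitions of the operation $O = o+o'$ we get:

$$\begin{array}{llll}
o_1+o'_1, & o_1+o'_2, & \ldots & o_1+o'_n\\
o_2+o'_1, & o_2+o'_2, & \ldots & o_2+o'_n\\
\ldots & & & \\
o_m+o'_1, & o_m+o'_2, & \ldots & o_m+o'_n.
\end{array}$$

Indicating by $M_r$ the half-invariants of the sum $O$ we get by (18)

$$mn\,e^{\frac{M_1}{1!}\tau+\frac{M_2}{2!}\tau^2+\cdots} = \Sigma\Sigma\,e^{(o_i+o'_k)\tau} = \Sigma e^{o_i\tau}\cdot\Sigma e^{o'_k\tau},$$

where $m$ and $n$ are the numbers of repetitions of $o$ and $o'$. Consequently, if $\mu_r$ represent
the half-invariants of $o$, and $\mu'_r$ of $o'$, we get

$$e^{\frac{M_1}{1!}\tau+\frac{M_2}{2!}\tau^2+\cdots} = e^{\frac{\mu_1}{1!}\tau+\frac{\mu_2}{2!}\tau^2+\cdots}\cdot e^{\frac{\mu'_1}{1!}\tau+\frac{\mu'_2}{2!}\tau^2+\cdots}$$

and finally

$$M_1 = \mu_1+\mu'_1,\qquad M_r = \mu_r+\mu'_r. \qquad (34)$$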


Employing the equation (17) instead of (18) we can also obtain fairly simple
expressions for the sums of powers of $(o+o')$ analogous to the binomial formula. But the
extreme simplicity of (34) renders the half-invariants unrivalled as the most suitable
symmetrical functions and the most powerful instrument of the theory of observations.
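The addition of half-invariants for bond-free sums can be seen directly by forming all $mn$ sums. The sketch below is illustrative only; the two short series of observations are invented, and `half_invariants` is an assumed helper that passes from the mean powers to the half-invariants.

```python
from fractions import Fraction
from math import comb


def half_invariants(values, order):
    """Half-invariants mu_1..mu_order of equally counted values."""
    n = len(values)
    m = [sum(Fraction(v) ** k for v in values) / n for k in range(order + 1)]
    mu = [Fraction(0)] * (order + 1)
    for r in range(1, order + 1):
        mu[r] = m[r] - sum(comb(r - 1, j - 1) * mu[j] * m[r - j] for j in range(1, r))
    return mu[1:]


def sum_half_invariants(o, o_prime, order):
    """Half-invariants of all m*n sums o_i + o'_k (no bond between the series)."""
    sums = [x + y for x in o for y in o_prime]
    return half_invariants(sums, order)


o = [0, 1, 1, 5]
o_prime = [2, 3, 7]
M = sum_half_invariants(o, o_prime, 5)
added = [a + b for a, b in zip(half_invariants(o, 5), half_invariants(o_prime, 5))]
print(M == added)
```

The comparison is exact for every order, whereas the corresponding relation for the sums of powers would require the binomial expansion.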

More generally, for every linear function of observations not connected by any bond,

$$O = a+bo+co'+\cdots+do''',$$

we obtain in the same manner and by (32)

$$\begin{aligned}
M_1(O) &= a+b\mu_1(o)+c\mu_1(o')+\cdots+d\mu_1(o''')\\
M_2(O) &= b^2\mu_2(o)+c^2\mu_2(o')+\cdots+d^2\mu_2(o''')\\
M_r(O) &= b^r\mu_r(o)+c^r\mu_r(o')+\cdots+d^r\mu_r(o'''),\quad r>1.
\end{aligned} \qquad (35)$$

When the errors of observation are sufficiently small, we shall also here generally
be able to give the most different functions a linear form. In consequence of this, the
propositions (34) and (35) acquire an almost universal importance, and afford nearly the
whole necessary foundation for the theory of the laws of errors of functions.

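Example 1. Determine the square of the mean error for differences of the $n$-th
order of equidistant tabular values, between which there is no bond, the square of the
mean error for every value being $= \lambda_2$.

$$\begin{aligned}
\lambda_2(\Delta^1) &= \lambda_2(o_1-o_0) = 2\lambda_2\\
\lambda_2(\Delta^2) &= \lambda_2(o_2-2o_1+o_0) = 6\lambda_2\\
\lambda_2(\Delta^3) &= \lambda_2(o_3-3o_2+3o_1-o_0) = 20\lambda_2\\
\lambda_2(\Delta^4) &= \lambda_2(o_4-4o_3+6o_2-4o_1+o_0) = 70\lambda_2\\
\lambda_2(\Delta^n) &= \lambda_2\cdot\frac21\cdot\frac62\cdot\frac{10}{3}\cdot\frac{14}{4}\cdots\frac{4n-2}{n}
\end{aligned}$$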
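By (35) the factor of $\lambda_2$ in this example is the sum of the squared coefficients of the $n$-th difference. A brief sketch (the function names are hypothetical) compares that sum with the product form and with the central binomial coefficient:

```python
from fractions import Fraction
from math import comb


def difference_factor(n):
    """Sum of the squared binomial coefficients of the n-th difference."""
    return sum(comb(n, k) ** 2 for k in range(n + 1))


def product_form(n):
    """The product (2/1)(6/2)(10/3)...((4n-2)/n)."""
    result = Fraction(1)
    for k in range(1, n + 1):
        result *= Fraction(4 * k - 2, k)
    return result


for n in range(1, 7):
    print(n, difference_factor(n), product_form(n), comb(2 * n, n))
```

The three columns coincide, so the square of the mean error grows nearly fourfold with every further order of difference.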


Example 2. By the observation of a meridional transit we observe two quantities,
viz. the time, $t$, when a star is covered behind a thread, and the distance, $f$, from the
meridian at that instant. But as it may be assumed that the time and the distance are
not connected by a bond, and as the speed of the star is constant and proportional to the
known value $\sin p$ ($p$ = polar distance), we always state the observation by the one
quantity, the time when the very meridian is passed, which we compute by the formula
$o = t+f\operatorname{cosec}p$.

The square of the mean error is

$$\lambda_2(o) = \lambda_2(t)+\operatorname{cosec}^2p\cdot\lambda_2(f).$$

Example 3. A scale is constructed by making marks on it at regular intervals,
in such a way that the square of the mean error on each interval is $\lambda_2$.

To measure the distance between two objects, we determine the distance of each
object from the nearest mark, the square of the mean error of this observation being $\lambda'_2$.
How great is the mean error in a measurement, by which there are $n$ intervals between
the marks we use?

$$\lambda_2(\text{length}) = n\lambda_2+2\lambda'_2.$$

Example 4. Two points are supposed to be determined by bond-free and equally
good ($\lambda_2 = 1$) measurements of their rectangular co-ordinates. The errors being small in
proportion to the distance, how great is the mean error in the distance $\Delta$?

Example 5. Under the same suppositions, what is the mean error in the
inclination to the $x$-axis?

Example 6. Having three points in a plane determined in the same manner by
their rectangular co-ordinates $(x, y)$, find the mean error of the angle
at the point $(x_1, y_1)$:

$$\lambda_2(V_1) = \frac{\Delta_1^2+\Delta_2^2+\Delta_3^2}{\Delta_2^2\,\Delta_3^2},$$

$\Delta_1, \Delta_2, \Delta_3$ being the sides of the triangle; $\Delta_1$ opposite to $(x_1, y_1)$.




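Examples 7 and 8. Find the mean errors in determinations of the areas of a
triangle and a plane quadrangle.

$$\lambda_2(\text{triangle}) = \tfrac14(\Delta_1^2+\Delta_2^2+\Delta_3^2);\qquad \lambda_2(\text{quadrangle}) = \tfrac12(\Delta^2+\Delta'^2),$$

$\Delta$ and $\Delta'$ being the diagonals of the quadrangle.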

§ 30. Non-linear functions of more than one argument present very great difficulties.
Even for integral rational functions no general expression for the law of errors can be found.
Nevertheless, even in this case it is possible to indicate a method for computing the
half-invariants of the function by means of those of the arguments. To do so it seems
indispensable to transform the laws of errors into the form of systems of sums of powers. If
$O = f(o, o', \ldots o^{(\nu)})$ be integral and rational, both it and its powers $O^r$ can be written as
sums of terms of the standard form $k\cdot o^a\,o'^b\ldots(o^{(\nu)})^d$, and for every such term the sum
resulting from the combination of all repetitions is $k\,s_a\,s'_b\ldots s^{(\nu)}_d$ (including the cases
where $a$ or $b$ or $d$ may be $= 0$), $s^{(i)}_c$ being the sum of all $c$-th powers of the repetitions of
$o^{(i)}$. Thus if $S_r$ indicates the sum of the $r$-th powers of the function $O$, we get

$$S_r = \Sigma\,k\,s_a\cdot s'_b\ldots s^{(\nu)}_d.$$

Of course, this operation is only practicable in the very simplest cases.

Example 1. Determine the mean value and mean deviation of the product $oo' = O$
of two observations without bonds. Here $S_0 = s_0s'_0$ and generally $S_r = s_rs'_r$, consequently
the mean value $M_1 = \mu_1\mu'_1$ and

$$M_2 = \mu_2\mu'_2+\mu_2\mu_1'^2+\mu_1^2\mu'_2;$$

$M_3$ already takes the cumbersome form

$$M_3 = \mu_3\mu'_3+3\mu_3\mu'_1\mu'_2+3\mu_1\mu_2\mu'_3+\mu_3\mu_1'^3+\mu_1^3\mu'_3+6\mu_1\mu_2\mu'_1\mu'_2.$$
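For the product of two bond-free observations the first three half-invariants can be verified by forming every product $o_io'_k$. The sketch is illustrative only: the series are invented, and the three closed forms coded in `product_formulas` are those which follow from $S_r = s_rs'_r$; they are stated in the code as an assumption of the sketch rather than as a quotation.

```python
from fractions import Fraction
from math import comb


def half_invariants(values, order):
    """Half-invariants mu_1..mu_order of equally counted values."""
    n = len(values)
    m = [sum(Fraction(v) ** k for v in values) / n for k in range(order + 1)]
    mu = [Fraction(0)] * (order + 1)
    for r in range(1, order + 1):
        mu[r] = m[r] - sum(comb(r - 1, j - 1) * mu[j] * m[r - j] for j in range(1, r))
    return mu[1:]


def product_half_invariants(o, o_prime):
    """M_1, M_2, M_3 of all m*n products o_i * o'_k."""
    return half_invariants([x * y for x in o for y in o_prime], 3)


def product_formulas(mu, mp):
    """Closed forms for M_1, M_2, M_3 (assumed here, derived from S_r = s_r * s'_r)."""
    m1 = mu[0] * mp[0]
    m2 = mu[1] * mp[1] + mu[1] * mp[0] ** 2 + mu[0] ** 2 * mp[1]
    m3 = (mu[2] * mp[2] + 3 * mu[2] * mp[0] * mp[1] + 3 * mu[0] * mu[1] * mp[2]
          + mu[2] * mp[0] ** 3 + mu[0] ** 3 * mp[2] + 6 * mu[0] * mu[1] * mp[0] * mp[1])
    return [m1, m2, m3]


o = [0, 1, 1, 5]
o_prime = [2, 3, 7]
print(product_half_invariants(o, o_prime))
print(product_formulas(half_invariants(o, 3), half_invariants(o_prime, 3)))
```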

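Example 2. Express exactly by the half-invariants of the co-ordinates the mean
value and the mean deviation of the square of the distance $r^2 = x^2+y^2$, if $x$ and $y$ are
observed without bonds. Here

$$\begin{aligned}
s_0(r^2) &= s_0(x)\,s_0(y)\\
s_1(r^2) &= s_2(x)\,s_0(y)+s_0(x)\,s_2(y)\\
s_2(r^2) &= s_4(x)\,s_0(y)+2s_2(x)\,s_2(y)+s_0(x)\,s_4(y)
\end{aligned}$$

and

$$\begin{aligned}
\mu_1(r^2) &= \mu_2(x)+(\mu_1(x))^2+\mu_2(y)+(\mu_1(y))^2\\
\mu_2(r^2) &= \mu_4(x)+4\mu_3(x)\mu_1(x)+2(\mu_2(x))^2+4\mu_2(x)(\mu_1(x))^2\\
&\quad+\mu_4(y)+4\mu_3(y)\mu_1(y)+2(\mu_2(y))^2+4\mu_2(y)(\mu_1(y))^2.
\end{aligned}$$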

§ 31. The most important application of proposition (35) is certainly the
determination of the law of errors of the mean value itself. The mean value

$$\mu_1 = \frac1m(o_1+o_2+\cdots+o_m)$$

is, we know, a linear function of the observed values, and we may treat the law of errors
for $\mu_1$ according to the said proposition, not only where we look upon $o_1, \ldots o_m$ as
perfectly unconnected, but also where we assume that they result from repetitions made
according to the same method. For, just like such repetitions, $o_1, \ldots o_m$ must not have
any other circumstances in common as connecting bonds than such as bind them all and
characterise the method.

As the law of presumptive errors of $o_1$ is just the same as for $o_2, \ldots o_m$, with the
known half-invariants $\lambda_1, \lambda_2, \ldots \lambda_r, \ldots$, we get according to (35)

$$\lambda_1(\mu_1) = \frac1m(\lambda_1+\cdots+\lambda_1) = \lambda_1,\qquad \lambda_2(\mu_1) = \frac{1}{m^2}(\lambda_2+\cdots+\lambda_2) = \frac{\lambda_2}{m},$$

and in general

$$\lambda_r(\mu_1) = \frac{\lambda_r}{m^{r-1}}. \qquad (37)$$


While, consequently, the presumptive mean of a mean value for $m$ repetitions is
the presumptive mean itself, the mean error on the mean value is reduced to $\frac{1}{\sqrt m}$ of
the mean error on the single observation. When the number $m$ is large, the formation
of mean values consequently reduces the uncertainty considerably; the reduction, however,
is proportionally greater with small than with large numbers. While already 4 repetitions
bring down the uncertainty to half of the original, 100 repetitions are necessary in order
to add one significant figure, and a million to add 3 figures to those due to the single
observation.
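The reduction of the half-invariants of a mean value can be illustrated by taking a small invented law of presumptive errors and forming every possible series of $m$ repetitions. This is a sketch under stated assumptions: the three equally frequent values are arbitrary, the function names are hypothetical, and integer observations are assumed so that the means are exact fractions.

```python
from fractions import Fraction
from itertools import product
from math import comb


def half_invariants(values, order):
    """Half-invariants of equally counted values."""
    n = len(values)
    m = [sum(Fraction(v) ** k for v in values) / n for k in range(order + 1)]
    mu = [Fraction(0)] * (order + 1)
    for r in range(1, order + 1):
        mu[r] = m[r] - sum(comb(r - 1, j - 1) * mu[j] * m[r - j] for j in range(1, r))
    return mu[1:]


def mean_half_invariants(values, m, order):
    """Half-invariants of the mean over every series of m repetitions."""
    means = [Fraction(sum(series), m) for series in product(values, repeat=m)]
    return half_invariants(means, order)


values = [0, 1, 4]
lam = half_invariants(values, 4)
for m in (1, 2, 4):
    print(m, mean_half_invariants(values, m, 4))
```

Each half-invariant of order $r$ is divided by $m^{r-1}$; in particular four repetitions divide the square of the mean error by 4 and so halve the mean error.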

The higher half-invariants of $\mu_1$ are reduced still more. If the $\lambda_3$, $\lambda_4$, etc., of
the single observation are so large that the law of errors cannot be called typical, no very
great numbers of $m$ will be necessary to realise the conditions $\lambda_3(\mu_1) = \lambda_4(\mu_1) = \cdots = 0$ with
an approximation that is sufficient in practice. It ought to be observed that this reduction
is not only absolute, but it holds good also in relation to the corresponding power of the
mean error $(\lambda_2(\mu_1))^{\frac r2}$, for (37) gives

$$\frac{\lambda_r(\mu_1)}{(\lambda_2(\mu_1))^{\frac r2}} = m^{1-\frac r2}\cdot\frac{\lambda_r}{\lambda_2^{\frac r2}},$$

which, for instance when $m = 4$, shows that the deviation of $\lambda_3$ from the typical form
which appears by means of only 4 repetitions, is halved; that of $\lambda_4$ is divided by 4, that
of $\lambda_5$ is divided by 8, etc. This shows clearly the reason why we attach great importance
to the typical form for laws of errors and make arrangements to abide by it in practice.
For it appears now that we possess in the formation of mean values a means of making
the laws of errors typical, even where they were not so originally. Therefore the standard
rule for all practical observations is this: Take care not to neglect any opportunities of
repeating observations and parts of observations, so that you can directly form the mean
values which should be substituted for the observed results; and this is to be done
especially in the case of observations of a novel character, or with peculiarities which lead us
to doubt whether the law of errors will be typical.

This remarkable property is peculiar, however, not to the mean only, but also,
though with less certainty, to any linear function of several observations, provided only
the coefficient of any single term is not so great relatively to the corresponding deviation
from the typical form that it throws all the other terms into the shade. From (35) it is
seen that, if the laws of errors of all the observations $o, o', \ldots$ are typical, the law
of errors for any of their linear functions will be typical too. And if the laws of errors
are not typical, then that of the linear function will deviate relatively less than any of the
observations $o, o', \ldots$

To avoid unnecessary complication we represent two terms of the linear function
simply by $o$ and $o'$. The deviation from the typical zero, which appears in the
half-invariants ($r > 2$), measured by the corresponding power of the mean error, will be less
for $O = o+o'$ than for the most discrepant of the terms $o$ and $o'$.

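The inequation

$$\frac{(\lambda_r(o))^2}{(\lambda_2(o))^r} > \frac{(\lambda_r(o'))^2}{(\lambda_2(o'))^r}$$

says only that, if the laws of errors for $o$ and $o'$ deviate unequally from the typical form,
it is the law of errors for $o$ that deviates most. But this involves

$$\left(\frac{\lambda_2(o')}{\lambda_2(o)}\right)^r > \left(\frac{\lambda_r(o')}{\lambda_r(o)}\right)^2,$$

or more briefly

$$T^r > R^2,$$

where $T$ is positive, $r > 2$.

When we introduce a positive quantity $U$, so that

$$T^r = U^2 > R^2,$$

it is evident that $(U+1)^2 \ge (R+1)^2$, and it is easily demonstrated that $(T+1)^r > (U+1)^2$.

Remembering that $x+x^{-1} \ge 2$, we get by the binomial formula

$$\left(T^{\frac12}+T^{-\frac12}\right)^r \ge T^{\frac r2}+T^{-\frac r2}+2^r-2 > U+U^{-1}+2.$$

Consequently

$$(T+1)^r > (U+1)^2 \ge (R+1)^2$$

and

$$\frac{(\lambda_r(o)+\lambda_r(o'))^2}{(\lambda_2(o)+\lambda_2(o'))^r} < \frac{(\lambda_r(o))^2}{(\lambda_2(o))^r};$$

but this is the proposition we have asserted, for the extension to any number of terms
causes no difficulty.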

But if it thus becomes a general law that the law of errors of linear functions
must more or less approach the typical form, the same must hold good also of all
moderately complex observations, such as those whose errors arise from a considerable number
of sources. The expression "source of errors" is employed to indicate circumstances which
undeniably influence the result, but which we have been obliged to pass over as unessential.
If we imagined these circumstances transferred to the class of essential circumstances, and
substantiated by subordinate observations, that which is now counted an observation would
occur as a function, into which the subordinate observations enter as independent variables;
and as we may assume, in the case of good observations, that the influence of each single
source of errors is small, this function may be regarded as linear. The approximation to
typical form which its law of errors would thus show, if we knew the laws of errors of
the sources of error, cannot be lost, simply because we, by passing them over as
unessential, must consider the sources of error in the compound observation as unknown.
Moreover, we may take it for granted that, in systematically arranged observations, every such
source of error as might dominate the rest will be the object of special investigation and,
if necessary, will be included among the essential circumstances or removed by corrective
calculations. The result then is that great deviations from the typical form of the law of
errors are rare in practice.

§ 32. It is of interest, of course, also to acquire knowledge of the laws of errors
for the determinations of $\mu_2$ and the higher half-invariants as functions of a given number
of repeated observations.

Here the method indicated in § 30 must be applied. But though the symmetry
of these functions and the identity of the laws of presumptive errors for $o_1, o_2, \ldots o_m$
afford very essential simplifications, still that method is too difficult. Not even for $\mu_2$ have
I discovered the general law of errors. In my "Almindelig Iagttagelseslære", København
1889, I have published tables up to the eighth degree of products of the sums of powers
$s_ps_q\ldots$, expressed by sums of terms of the form $o^po'^q\ldots$; those are here directly
applicable. In W. Fiedler: "Elemente der neueren Geometrie und der Algebra der binären
Formen", Leipzig 1862, tables up to the [...] degree will be found. Their use is more
difficult, because they require the preliminary transformation of the $s_p$ to the coefficients
$a_p$ of the rational equations § 21. There are such tables also in the Algebra by Meyer
Hirsch, and Cayley has given others in the Philosophical Transactions 1857 (Vol. 147,
p. 489). I have computed the four principal half-invariants of $\mu_2$:

$$\begin{aligned}
m\,\lambda_1(\mu_2) &= (m-1)\lambda_2\\
m^3\lambda_2(\mu_2) &= (m-1)^2\lambda_4+2m(m-1)\lambda_2^2\\
m^5\lambda_3(\mu_2) &= (m-1)^3\lambda_6+12m(m-1)^2\lambda_4\lambda_2+4m(m-1)(m-2)\lambda_3^2+8m^2(m-1)\lambda_2^3\\
m^7\lambda_4(\mu_2) &= (m-1)^4\lambda_8+24m(m-1)^3\lambda_6\lambda_2+32m(m-1)^2(m-2)\lambda_5\lambda_3\\
&\quad+8m(m-1)(4m^2-9m+6)\lambda_4^2+144m^2(m-1)^2\lambda_4\lambda_2^2\\
&\quad+96m^2(m-1)(m-2)\lambda_3^2\lambda_2+48m^3(m-1)\lambda_2^4.
\end{aligned} \qquad (38)$$

Here $m$ is the number of repetitions.

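Of $\mu_3$ and $\mu_4$ only the mean values and the mean errors have been found:

$$\begin{aligned}
m^2\lambda_1(\mu_3) &= (m-1)(m-2)\lambda_3\\
m^5\lambda_2(\mu_3) &= (m-1)^2(m-2)^2\lambda_6+9m(m-1)(m-2)^2(\lambda_4\lambda_2+\lambda_3^2)+6m^2(m-1)(m-2)\lambda_2^3
\end{aligned} \qquad (39)$$

and

$$\begin{aligned}
m^3\lambda_1(\mu_4) &= (m-1)(m^2-6m+6)\lambda_4-6m(m-1)\lambda_2^2\\
m^7\lambda_2(\mu_4) &= (m-1)^2(m^2-6m+6)^2\lambda_8\\
&\quad+8m(m-1)(m^2-6m+6)(2m^2-15m+15)\lambda_6\lambda_2\\
&\quad+48m(m-1)(m-2)(m-4)(m^2-6m+6)\lambda_5\lambda_3\\
&\quad+2m(m-1)(17m^4-204m^3+852m^2-1404m+828)\lambda_4^2\\
&\quad+24m^2(m-1)(3m^3-38m^2+150m-138)\lambda_4\lambda_2^2\\
&\quad+144m^2(m-1)(m-2)(m-4)(m-5)\lambda_3^2\lambda_2\\
&\quad+24m^3(m-1)(m^2-6m+24)\lambda_2^4.
\end{aligned} \qquad (40)$$

Further I know only that

$$m^4\lambda_1(\mu_5) = (m-1)(m-2)\left\{(m^2-12m+12)\lambda_5-60m\lambda_3\lambda_2\right\}, \qquad (41)$$

$$\begin{aligned}
m^5\lambda_1(\mu_6) &= (m-1)(m^4-30m^3+150m^2-240m+120)\lambda_6\\
&\quad-30m(m-1)(7m^2-36m+36)\lambda_4\lambda_2\\
&\quad-60m(m-1)(m-2)(3m-8)\lambda_3^2\\
&\quad-60m^2(m-1)(m-6)\lambda_2^3,
\end{aligned} \qquad (42)$$

$$\begin{aligned}
m^6\lambda_1(\mu_7) &= (m-1)(m-2)(m^4-60m^3+420m^2-720m+360)\lambda_7\\
&\quad-630m(m-1)(m-2)(m^2-8m+8)\lambda_5\lambda_2\\
&\quad-210m(m-1)(m-2)(7m^2-48m+60)\lambda_4\lambda_3\\
&\quad-1260m^2(m-1)(m-2)(m-10)\lambda_3\lambda_2^2,
\end{aligned} \qquad (43)$$

$$\begin{aligned}
m^7\lambda_1(\mu_8) &= (m-1)(m^6-126m^5+1806m^4-8400m^3+16800m^2-15120m+5040)\lambda_8\\
&\quad-56m(m-1)(31m^4-540m^3+2340m^2-3600m+1800)\lambda_6\lambda_2\\
&\quad-1680m(m-1)(m-2)(3m^3-40m^2+120m-96)\lambda_5\lambda_3\\
&\quad-70m(m-1)(49m^4-720m^3+3168m^2-5400m+3240)\lambda_4^2\\
&\quad-840m^2(m-1)(7m^3-150m^2+576m-540)\lambda_4\lambda_2^2\\
&\quad-10080m^2(m-1)(m-2)(m^2-18m+40)\lambda_3^2\lambda_2\\
&\quad-840m^3(m-1)(m^2-30m+90)\lambda_2^4.
\end{aligned} \qquad (44)$$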

Some $\lambda$'s of products of the $\mu_2$, $\mu_3$ and $\mu_4$ present in general the same
characteristics as the above formulae. The most prominent of these characteristics are:

1) It is easily explained that $\lambda_1$ is only to be found in the equation $\lambda_1(\mu_1) = \lambda_1$;
indeed no other half-invariant than the mean value can depend on the zero of the
observations. In my computations this characteristic property has afforded a system of multiple
checks of the correctness of the above results.

2) All mean values $\lambda_1(\mu_r)$ are functions of the $0$-th degree with regard to $m$, all squares
of mean errors $\lambda_2(\mu_r)$ are of the $(-1)$-st degree, and generally each $\lambda_s(\mu_r)$ is a function
of the $(1-s)$-th degree, in perfect accordance with the law of large numbers.

3) The factor $m-1$ appears universally as a necessary factor of $\lambda_s(\mu_r)$, if only
$r > 1$. If $r$ is an odd number, even the factor $m-2$ appears, and, likewise, if $r$ is an
even number, this factor is constantly found in every term that is multiplied by one or
more $\lambda$'s with odd indices. No obliquity of the law of errors can occur unless at least three
repetitions are under consideration.

4) Many particulars indicate these functions as compounds of factorials
$(m-1)(m-2)\ldots(m-r)$ and powers of $m$.
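Formulae of this kind admit the same sort of multiple check by direct enumeration. The following sketch, which is only an illustration, takes a small invented law of presumptive errors (a few equally frequent values), forms every series of $m$ repetitions, and compares the enumerated mean values of $\mu_2$, $\mu_3$, $\mu_4$ and the square of the mean error of $\mu_2$ with closed forms in $\lambda_2$, $\lambda_3$, $\lambda_4$. The closed forms as coded, including the term $-6m(m-1)\lambda_2^2$ in the mean value of $\mu_4$, are assumptions of the sketch.

```python
from fractions import Fraction
from itertools import product
from math import comb


def half_invariants(values, order):
    """Half-invariants of equally counted values."""
    n = len(values)
    m = [sum(Fraction(v) ** k for v in values) / n for k in range(order + 1)]
    mu = [Fraction(0)] * (order + 1)
    for r in range(1, order + 1):
        mu[r] = m[r] - sum(comb(r - 1, j - 1) * mu[j] * m[r - j] for j in range(1, r))
    return mu[1:]


def check_formulas(values, m):
    """Pairs (enumerated, closed form) for series of m repetitions."""
    lam = half_invariants(values, 4)
    l2, l3, l4 = lam[1], lam[2], lam[3]
    samples = [half_invariants(series, 4) for series in product(values, repeat=m)]
    count = len(samples)
    mean_mu2 = sum(s[1] for s in samples) / count
    mean_mu3 = sum(s[2] for s in samples) / count
    mean_mu4 = sum(s[3] for s in samples) / count
    var_mu2 = sum(s[1] ** 2 for s in samples) / count - mean_mu2 ** 2
    return [
        (m * mean_mu2, (m - 1) * l2),
        (m ** 3 * var_mu2, (m - 1) ** 2 * l4 + 2 * m * (m - 1) * l2 ** 2),
        (m ** 2 * mean_mu3, (m - 1) * (m - 2) * l3),
        (m ** 3 * mean_mu4, (m - 1) * (m * m - 6 * m + 6) * l4 - 6 * m * (m - 1) * l2 ** 2),
    ]


for enumerated, closed in check_formulas([0, 1, 4], 3):
    print(enumerated, closed)
```

For $m = 2$ the enumerated mean of $\mu_3$ is exactly zero, which illustrates the remark that no obliquity can appear with fewer than three repetitions.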

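If, supposing the presumed law of errors to be typical, we put $\lambda_3 = \lambda_4 = \cdots = 0$,
then some further inductions can be made. In this case the law of errors of $\mu_2$ may be
written

$$e^{\frac{\lambda_1(\mu_2)}{1!}\tau+\frac{\lambda_2(\mu_2)}{2!}\tau^2+\cdots} = \left(1-\frac{2\lambda_2}{m}\tau\right)^{-\frac{m-1}{2}}. \qquad (45)$$

As to the squares of mean errors of $\mu_r$ we get under the same supposition:

$$\begin{aligned}
\lambda_2(\mu_1) &= \frac{1}{m}\lambda_2\\
\lambda_2(\mu_2) &= \frac{2(m-1)}{m^2}\lambda_2^2\\
\lambda_2(\mu_3) &= \frac{6(m-1)(m-2)}{m^3}\lambda_2^3\\
\lambda_2(\mu_4) &= \frac{24(m-1)(m^2-6m+24)}{m^4}\lambda_2^4
\end{aligned} \qquad (46)$$

indicating that generally, when $m$ is large, $\lambda_2(\mu_r)$ is nearly $\frac{r!}{m}\lambda_2^r$.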


This proposition is of very great interest. If we have a number $m$ of repetitions
at our disposal for the computation of a law of actual errors, then it will be seen that
the relative mean errors of $\mu_1, \mu_2, \ldots \mu_r$ are by no means uniform, but increase with
the index $r$. If $m$ is large enough to give us $\mu_1$ precisely and $\mu_2$ fairly well, then $\mu_3$ and
$\mu_4$ can be only approximately indicated; and the higher half-invariants are only to be
guessed, if the repetitions are not counted by thousands or millions.

As all numerical coefficients in $\lambda_2(\mu_r)$ increase with $r$, almost in the same degree
as the coefficients 1, 2, 6, and 24 of (46), we must presume that the law of increasing
uncertainty of the half-invariants has a general character.

We have hitherto been justified in speaking of the principal half-invariants as the
complete collection of the $\mu_r$'s or $\lambda_r$'s with the lowest indices, considering a complete series
of the first $m$ half-invariants to be necessary to an unambiguous determination of a law of
errors for $m$ repetitions.

We now accept that principle as a system of relative rank of the half-invariants
with increasing uncertainty and consequently with a decreasing importance of the
half-invariants with higher indices.

We need scarcely say that there are some special exceptions to this rule. For
instance if $\lambda_4 = -2\lambda_2^2$, as in alternative experiments with equal chances for and against
(pitch and toss), then $\lambda_2(\mu_2)$ is reduced to $\frac{2(m-1)}{m^3}\lambda_2^2$, which is only of the $(-2)$-nd order.

§ 33. Now we can undertake to solve the main problem of the theory of
observations, the transition from laws of actual errors to those of presumptive errors. Indeed
this problem is not a mathematical one, but it is eminently practical. To reason from the
actual state of a finite number of observations to the law governing infinitely numerous
presumed repetitions is an evident trespass; and it is a mere attempt at prophecy to
predict, by means of a law of presumptive errors, the results of future observations.

The struggle for life, however, compels us to consult the oracles. But the modern
oracles must be scientific; particularly when they are asked about numbers and quantities,
mathematical science does not renounce its right of criticism. We claim that confusion
of ideas and every ambiguous use of words must be carefully avoided; and the necessary
act of will must be restrained to the acceptation of fixed principles, which must agree
with the law of large numbers.

It is hardly possible to propose more satisfactory principles than the following: 

The mean value of all available repetitions can be taken directly, without any
change, as an approximation to the presumptive mean.

If only one observation without repetition is known, it must itself, consequently, 
be considered an approximation to the presumptive mean value. 

The solitary value of any symmetrical and univocal function of repeated observations
must in the same way, as an isolated observation, be considered the presumptive mean of
this function, for instance $\mu_r = \lambda_1(\mu_r)$.

Thus, from the equations 37—41, we get by m repetitions: 


2i fii 

, «* 

” (»-!)(»»»-6m+6) (^* 

" {»r-1) (m - Sj (m* - 12 m+12) )' 


(«) 


as to ylg, ^8 ^ preferable to use the equations 42—44 themselves, putting only 

^(y«8) — ^ fhx and ^i(/zg) « /ig. 

Inversely, if the presumptive law of errors is known in this way, or by adoption
of any theory or hypothesis, we predict the future observations, or functions of observations,
principally by computing their presumptive mean values. These predictions however, though
univocal, are never to be considered as exact values, but only as the first and most important
terms of laws of errors.

If necessary, we complete our predictions with the mean errors and higher half-invariants,
computed for the predicted functions of observations by the presumed law of
errors, which itself belongs to the single observations. These supplements may often be 
useful, nay necessary, for the correct interpretation of the prediction. The ancient oracles 
did not release the questioner from thinking and from responsibility, nor do the modern 
ones; yet there is a difference in the manner. If the crossing of a desert is calculated to 
last 20 days, with a mean error of one day, then you would be very unwise, to be sure, 
if you provided for exactly 20 days; by so doing you incur as great a probability of dying 
as of living. Even with provisions for 21 days the journey is evidently dangerous. But 
if you can carry with you provisions for 23-25 days, the undertaking may be reasonable.
Your life must be at stake to make you set out with provisions for only 17 days or less. 
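
The figures of the desert example can be checked numerically. The following is only a sketch: it assumes that the duration of the crossing follows the typical (normal) law of errors with mean 20 days and mean error 1 day, which the text does not state explicitly, and the function names are illustrative.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative probability of the typical (normal) law at z mean errors."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_journey_exceeds(provisions_days: float,
                         mean_days: float = 20.0,
                         mean_error_days: float = 1.0) -> float:
    """Probability that the crossing lasts longer than the provisions.

    Assumption: the duration follows the typical law of errors.
    """
    z = (provisions_days - mean_days) / mean_error_days
    return 1.0 - normal_cdf(z)


# Risk of running out of provisions for several choices of supply.
risks = {days: prob_journey_exceeds(days) for days in (17, 20, 21, 23, 25)}
for days, risk in risks.items():
    print(f"provisions for {days} days: risk {risk:.5f}")
```

Under this assumption provisions for exactly 20 days give an even chance of failure, one extra day still leaves a risk of roughly one in six, and three or more extra days reduce it to about one in a thousand or less.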

In addition to the uncertainty provided against by the presumptive law of error,
the prediction may be vitiated by the uncertainty of the data of the presumptive law itself.
When this law has resulted from purely theoretical speculation, it is always impossible to
calculate its uncertainty. It may be quite exact, or partially or absolutely false; we are
left to choose between its admission and its rejection, as long as no trial of the prediction
by repeated observations has given us a corresponding law of actual errors, by which it
can be improved on.





If the law of presumptive errors has been computed by means of a law of actual 
errors, we can, according to (37), employ the values ... and the number m of 
actual observations for the determination of In this case the complete half-invariants
of a predicted single observation are given analogously to the law of errors of the sum
of two bondless observations by

^2 


Though we can in the same way compute the uncertainties of >i,, and it 
is far more difficult, or rather impossible, to make use of these results for the improvement 
of general predictions. 

Of the higher half-invariants we can very seldom, if ever, get so much as a rough 
estimate by the method of laws of actual errors. The same reasons that cause this
difficulty, render it a matter of less importance to obtain any precise determination. 
Therefore the general rule of the formation of good laws of presumptive errors must be: 

1. In determining $\lambda_1$ and $\lambda_2$ rely almost entirely upon the actual observed values.

2. As to the half-invariants with high indices, say from $\lambda_6$ upwards, rely as
exclusively upon theoretical considerations.

3. Employ the indications obtainable by actual observed values for the intermediate
half-invariants as far as possible when you have the choice between the theories in (2).

From what is said above of the properties of the typical law of errors, it is evident
that no other theory can fairly rival it in the multiplicity and importance of applications.
It is not only constantly applied when $\lambda_3$, $\lambda_4$, and $\lambda_5$ are proved to be very small, but it
is used almost universally as long as the deviations are not very conspicuous. In these
cases also great efforts will be made to reduce the observations to the typical form by
modifying the methods or by substituting means of many observed values instead of the
non-typical single observations. The preference for the typical observations is intensified
by the difficulty of establishing an entirely correct method of adjustment (see the following
chapters) of observations which are not typical.

In those particular cases where $\lambda_3$ or $\lambda_4$ or $\lambda_5$ cannot be regarded as small, the
theoretical considerations (proposition 2 above) as to $\lambda_6$ and the higher half-invariants ought
not to result in putting the latter $= 0$. As shown in "Videnskabernes Selskabs Oversigter",
1830, p. 140, such laws of errors correspond to divergent series or imply the existence of
imaginary observations. The coefficients h of the functional law of errors (equation (0))
have this merit in preference to the half-invariants, that no term implies the existence
of any other.

This series 

where ip[x]^e * 4 " (the direct expression (31) is found p, 35), is therefore recommended 
as perhaps the best general expression for non-typical laws of errors. The functional form 
of the law of errors has here, and in every prediction of future results, the advantage of 
showing directly the probabilities of the different possible values. 

The skew and other non-typical laws of errors seem to have some very interesting
applications to biological observations, especially to the variations of species. The scientific
treatment of such variations seems altogether to require a methodical use of the notion of
laws of errors. Mr. K. Pearson has given a series of skillful computations of biological and
other similar laws of errors (Contributions to the Math. Theory of Evolution, Phil. Trans.
v. 186, p. 343). Here he makes very interesting efforts to develop the refractory binomial
functions into a basis for the treatment of skew laws of errors. But there are evidently
no natural links between these functions and the biological problems, and the above formulae
(31) will prove to be easier and more powerful instruments. In cases of very abnormal
or discontinuous laws of errors, more refined methods of adjustment are required.

Example 1. From the 500 experiments given in § 14 are to be calculated the
presumptive half-invariants up to $\lambda_5$ and by (31) the frequencies of the special events out
of a number of $m = 500$ new repetitions. You will find $\lambda_1 = 11.86$, $\lambda_2 = 4.1647$,
$\lambda_3 = 4.708$, $\lambda_4 = 3.895$, and $\lambda_5 = -26.946$. A comparison of the computed frequencies
with the observed ones gives:

               Frequency
Events    computed    observed      o-c
   4         0.0          0          0.0
   5        -0.1          0         +0.1
   6        -0.3          0         +0.3
   7         1.6          3         +1.4
   8        12.3          7         -5.3
   9        39.6         35         -4.6
  10        78.2        101        +22.8
  11       104.1         89        -15.1
  12        97.7         94         -3.7
  13        69.4         70         +0.6
  14        42.8         46         +3.2
  15        26.7         30         +3.3
  16        16.0         15         -1.0
  17         8.0          4         -4.0
  18         3.0          5         +2.0
  19         0.8          1         +0.2
  20         0.2          0         -0.2
  21         0.0          0          0.0
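
The first half-invariants of Example 1 can be recomputed from the observed column of the table. This is a sketch only: it assumes the usual corrections $\lambda_2 = \frac{m}{m-1}\mu_2$ and $\lambda_3 = \frac{m^2}{(m-1)(m-2)}\mu_3$ for passing from the actual to the presumptive half-invariants, and the names used are illustrative.

```python
from fractions import Fraction

# Observed frequencies of Example 1 (event value -> number of occurrences).
observed = {7: 3, 8: 7, 9: 35, 10: 101, 11: 89, 12: 94, 13: 70,
            14: 46, 15: 30, 16: 15, 17: 4, 18: 5, 19: 1}


def central_moment(freq: dict, r: int) -> Fraction:
    """r-th moment about the mean of a frequency table, computed exactly."""
    total = sum(freq.values())
    mean = Fraction(sum(x * f for x, f in freq.items()), total)
    return sum(f * (x - mean) ** r for x, f in freq.items()) / total


m = sum(observed.values())                      # number of repetitions
lam1 = Fraction(sum(x * f for x, f in observed.items()), m)
mu2 = central_moment(observed, 2)               # actual half-invariant mu_2
mu3 = central_moment(observed, 3)               # actual half-invariant mu_3

# Assumed corrections from actual to presumptive half-invariants.
lam2 = Fraction(m, m - 1) * mu2
lam3 = Fraction(m * m, (m - 1) * (m - 2)) * mu3

print(m, float(lam1), float(lam2), float(lam3))
```

With these assumptions the computation reproduces the printed $\lambda_1 = 11.86$ and $\lambda_2 = 4.1647$, and agrees with the printed $\lambda_3 = 4.708$ to about three decimals.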


Example 2. Determine the law of errors by experiments with alternative results,
either "yes" observed $m$ times and every time indicated by 1, or "no" observed $n$ times
and indicated by 0. What is the square of the mean error for the single experiment?

$$\lambda_2 = \frac{mn}{(m+n)(m+n-1)};$$

for the probability determined by the whole series?

$$\lambda_2(\mu_1) = \frac{mn}{(m+n)^2(m+n-1)};$$

and for the frequency of "yes" in the $m + n$ experiments?

$$\lambda_2\big((m+n)\mu_1\big) = \frac{mn}{m+n-1}.$$
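
The three answers of Example 2 follow from one another by the rule that the square of the mean error of a mean of $N$ observations is $1/N$ of that of a single observation. A minimal sketch in exact arithmetic, with illustrative names and the assumed correction factor $N/(N-1)$ applied to the actual $\mu_2$:

```python
from fractions import Fraction


def alternative_experiment(m_yes: int, n_no: int):
    """Squares of mean errors for m 'yes' (1) and n 'no' (0) results."""
    total = m_yes + n_no
    mean = Fraction(m_yes, total)                 # observed relative frequency
    mu2 = mean * (1 - mean)                       # actual mu_2 of 0/1 values
    lam2_single = Fraction(total, total - 1) * mu2    # single experiment
    lam2_probability = lam2_single / total            # mean of the series
    lam2_count = lam2_single * total                  # number of 'yes' in total trials
    return lam2_single, lam2_probability, lam2_count


single, probability, count = alternative_experiment(3, 7)
print(single, probability, count)
```

For $m = 3$, $n = 7$ this gives $mn/((m+n)(m+n-1)) = 7/30$ for the single experiment and $mn/(m+n-1) = 7/3$ for the frequency.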


§ 34. If observations are made and repeated, although their presumptive mean 
value is previously known, exactly or very accurately, the law of presumptive errors of 
the half-invariants $\mu_2, \mu_3 \ldots$ computed by reducing the zero of the observation
to the known $\lambda_1$. Putting thus $s_1 = 0$ and $\mu_1 = 0$ in the equations (18) and (21) we
obtain in analogy to (38)-(41) the following modified equations, the number of repetitions
being $m$:


/<a 


i„ 

^8 


(48) 




w—• 10, 

• ill 

m ® 


m 


is ^8' 


From the first of these equations we deduce the very important principle, that
every mean of the squares of differences between repeated bond-free observations and their
presumptive mean value is approximately equal to the square of the mean error:

$$\lambda_2 = \frac{1}{m}\sum (o - \lambda_1)^2 \qquad (49)$$

Consequently, for any isolated observed value we must expect that

$$(o - \lambda_1)^2 = \lambda_2. \qquad (50)$$


§ 35. In the following chapters, and in almost all practical applications, we shall
work only with the typical law of errors as our constant supposition. This gives simplicity
and clearness, and thus $a \pm b$ may be recommended as a short statement of the law of
errors, $a$ indicating a result of an observation found directly or indirectly by computation
with observations, and $b$ expressing the mean error of the same result.

By the "weights" of observations we understand numbers inversely proportional to
the squares of the mean errors, consequently $v \propto 1/\lambda_2$. The idea presents itself when we
speak of the means of various numbers of observed values which have been obtained by
the same method, as the latter numbers here, according to (37), represent the weights.
When $v_r$ is the weight of the partial mean value $m_r$, the total mean value $m$ must be
computed according to the formula

$$m = \frac{v_1 m_1 + v_2 m_2 + \ldots + v_r m_r}{v_1 + v_2 + \ldots + v_r},$$

which is analogous to the formula for the abscissa of the centre of gravity, if $m_r$ is the
abscissa of any single body, $v_r$ its weight. We speak also of the weights of single observations,
according to the above definition, and particularly in cases where we can only
estimate the relative goodness of several observations in comparison to the trustworthiness
of the means of various numbers of equally good observations.
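
The rule for combining partial means by their weights can be sketched as follows. The function name and the sample figures are illustrative only; the weights are here simply the numbers of observations behind each partial mean, as in the case described above.

```python
from fractions import Fraction


def combine_partial_means(partial_means, weights):
    """Total mean as the weighted mean (centre of gravity) of partial means."""
    total_weight = sum(weights)
    return sum(Fraction(v) * m for v, m in zip(weights, partial_means)) / total_weight


# Two groups of equally good observations (hypothetical values).
group_1 = [9, 11]
group_2 = [14, 15, 16]
partial_means = [Fraction(sum(g), len(g)) for g in (group_1, group_2)]
weights = [len(group_1), len(group_2)]

total_mean = combine_partial_means(partial_means, weights)
pooled_mean = Fraction(sum(group_1 + group_2), len(group_1 + group_2))
print(total_mean, pooled_mean)
```

The weighted combination of the partial means coincides with the mean of all the single observations taken together, which is the reason for using the numbers of observations as weights.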

The phrase probable error, which we still find frequently employed by authors and
observers, is for several reasons objectionable. It can be used only with typical or at any
rate symmetrical laws of errors, and indicates then the magnitude of errors for which the
probabilities of smaller and larger errors are both equal to $\frac{1}{2}$. The simultaneous use of
the ideas "mean error" and "probable error" causes confusion, and it is evidently the latter
that must be abandoned, as it is less commonly applicable, and as it can only be computed
in the cases of the typical law of errors by the previously computed mean error as
$0.6745\sqrt{\lambda_2}$, while on the other hand the computation of the mean error is quite
independent of that of the probable error. As errors which are larger than the probable
one still frequently occur, this idea is not so well adapted as the mean error to serve as
a limit between the frequent "small" errors and the rarer "large" ones. The use of the
probable error tempts us constantly to overvalue the degree of accuracy we have attained.

More dangerous still is another confusion which now and then occurs, when the
very expression mean error is used in the sense of the average error of the observed values
according to their numerical values without regard to the signs. This gives no sense,
except when we are certain of a law of typical errors, and with such a one this "mean
error" is $\sqrt{2/\pi}\,\sqrt{\lambda_2}$. The only reason which may be advanced in defence of the use of this
idea is that we are spared some little computations, viz. some squarings and the extraction
of a square root, which, however, we rarely need work out with more than three significant
figures.
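
Both constants mentioned here, the ratio of the probable error to the mean error and the ratio of the average error without regard to sign to the mean error, can be verified for the typical law. The sketch below assumes the typical (normal) law with mean error 1; the bisection routine and the names are illustrative.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative probability of the typical law with mean error 1."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probable_error_factor(tolerance: float = 1e-12) -> float:
    """Find e such that errors smaller than e have probability 1/2."""
    low, high = 0.0, 1.0
    while high - low > tolerance:
        middle = (low + high) / 2.0
        if 2.0 * normal_cdf(middle) - 1.0 < 0.5:
            low = middle
        else:
            high = middle
    return (low + high) / 2.0


probable_factor = probable_error_factor()
# Average absolute error of the typical law, in units of the mean error.
average_factor = math.sqrt(2.0 / math.pi)
print(round(probable_factor, 4), round(average_factor, 4))
```

The first factor comes out close to 0.6745 and the second close to 0.7979, so both the probable error and the average error understate the mean error when quoted in its place.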


IX. FREE FUNCTIONS.

§ 36. The foregoing propositions concerning the laws of errors of functions,
especially of linear functions, form the basis of the theory of computation with observed
values, a theory which in several important things differs from exact mathematics. The
result, particularly, is not an exact quantity, but always a law of errors which can be
represented by its mean value and its mean error, just like the single observation. Moreover,
the computation must be founded on a correct apprehension of what observations
we may consider mutually unbound, another thing which is quite foreign to exact mathematics.
For it is only upon the supposition that the result $R = r_1 o_1 + \ldots + r_n o_n = [ro]$
(observe the abbreviated notation) is a linear function of unbound observations only,
$o_1 \ldots o_n$, that we have demonstrated the rules of computation (35)

$$\lambda_1(R) = r_1\lambda_1(o_1) + \ldots + r_n\lambda_1(o_n) = [r\lambda_1(o)] \qquad (52)$$

$$\lambda_2(R) = r_1^2\lambda_2(o_1) + \ldots + r_n^2\lambda_2(o_n) = [r^2\lambda_2(o)] \qquad (53)$$

While the results of computations with observed quantities, taken singly, have laws
of errors in the same way as the observations, they also resemble the observations in the
circumstance that there can be bonds between them, and, unfortunately, there can be
"bonds between results", even though they are derived from unbound observations. If
only some observations have been employed in the computation of both $R' = [r'o]$ and
$R'' = [r''o]$, these results will generally be bound to each other. This, however, does not
prevent us from computing a law of errors, for instance for $aR' + bR''$. We can, at any
rate, represent the function of the results directly as a function of the unbound observations,

$$aR' + bR'' = [(ar' + br'')o]. \qquad (54)$$

This possibility is of some importance for the treatment of those cases in which
the single observations are bound. They must be treated then just like results, and we
must try to represent them as functions of the circumstances which they have in common,
and which must be given instead of them as original observations. This may be difficult
to do, but as a principle it must be possible, and functions of bound observations must
therefore always have laws of errors as well as others; only, in general, it is not possible
to compute these laws of errors correctly simply by means of the laws of errors of the
observations only, just as we cannot, in general, compute the law of errors for $aR' + bR''$
by means of the laws of errors for $R'$ and $R''$.

In example 5, § 29, we found the mean error in the determination of a direction B 
between two points, which were given by bond-free and equally good « ^j(y) — 1) 
measurements of their rectangular co-ordinates, viz.: and then, in example 6, 

we determined the angle F in a triangle whose points were determined in the same way. 
It seems an obvious conclusion then that, as F« 5 '—jB", we must have A 2 (n 

- « Yi +2^* solution IS ^s(F) » 

where and f are the sides of the triangle. The cause of this is, of course, that 
the co-ordinates of the angular point enter into both directions and bind E and E* together. 
But it is remarkable then that, when F is a right angle, the solutions are identical. 

With equally good unbound observations, $o_0$, $o_1$, $o_2$, and $o_3$, we have

$$\lambda_2(o_2 - 2o_1 + o_0) = 6\lambda_2(o) = \lambda_2(o_3 - 2o_2 + o_1),$$

but

$$\lambda_2(o_3 - 3o_2 + 3o_1 - o_0) = 20\lambda_2(o),$$

although $o_3 - 3o_2 + 3o_1 - o_0 = (o_3 - 2o_2 + o_1) - (o_2 - 2o_1 + o_0)$, according to which we
should expect to find

$$\lambda_2(o_3 - 3o_2 + 3o_1 - o_0) = 6\lambda_2(o) + 6\lambda_2(o) = 12\lambda_2(o).$$

But if, on the other hand, we combine the two functions

$$R = o_0 + 6o_1 - 4o_2 \quad \text{and} \quad R' = 2o_1 + 3o_2 - o_3,$$

where $\lambda_2(R) = 53\lambda_2(o)$ and $\lambda_2(R') = 14\lambda_2(o)$, and from this compute $\lambda_2$ for any function
$aR + bR'$, then, curiously enough, we get as the correct result
$\lambda_2(aR + bR') = (53a^2 + 14b^2)\lambda_2(o) = a^2\lambda_2(R) + b^2\lambda_2(R')$.
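
The numbers of this example are easily checked from rule (53). In the sketch below each linear function is written as its tuple of coefficients of $(o_0, o_1, o_2, o_3)$, with $\lambda_2(o) = 1$; the helper names `lam2_of` and `bond` are illustrative, `bond` being the sum of products of coefficients that decides whether two functions disturb one another.

```python
def lam2_of(coeffs, lam2_obs=None):
    """Square of the mean error of [ro] for unbound observations, rule (53)."""
    lam2_obs = lam2_obs or [1] * len(coeffs)
    return sum(r * r * l for r, l in zip(coeffs, lam2_obs))


def bond(p, q, lam2_obs=None):
    """Sum of products [p q lambda_2(o)]; zero means no disturbing bond."""
    lam2_obs = lam2_obs or [1] * len(p)
    return sum(pi * qi * l for pi, qi, l in zip(p, q, lam2_obs))


# Coefficients of (o0, o1, o2, o3).
second_low = (1, -2, 1, 0)       # o2 - 2 o1 + o0
second_high = (0, 1, -2, 1)      # o3 - 2 o2 + o1
third = tuple(h - l for h, l in zip(second_high, second_low))

R = (1, 6, -4, 0)                # o0 + 6 o1 - 4 o2
R_prime = (0, 2, 3, -1)          # 2 o1 + 3 o2 - o3

print(lam2_of(second_low), lam2_of(third), bond(second_high, second_low))
print(lam2_of(R), lam2_of(R_prime), bond(R, R_prime))
```

The two second differences share a sum of products of $-4$, which accounts for the missing $8$ between $12$ and $20$; for $R$ and $R'$ the sum of products vanishes, so their squares of mean errors simply add.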

Gauss's general prohibition against regarding results of computations (especially
those of mean errors) from the same observations as analogous to unbound observations,
has long hampered the development of the theory of observations.

To Oppermann and, somewhat later, to Helmert is due the honour of having
discovered that the prohibition is not absolute, but that wide exceptions enable us to
simplify our calculations. We must therefore study thoroughly the conditions on which
actually existing bonds may be harmless.

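Let $o_1 \ldots o_n$ be mutually unbound observations with known laws of errors, $\lambda_1(o_i)$,
$\lambda_2(o_i)$, of typical form. Let two general, linear functions of them be

$$[po] = p_1 o_1 + \ldots + p_n o_n$$

$$[qo] = q_1 o_1 + \ldots + q_n o_n.$$

For these then we know the laws of errors

$$\lambda_1[po] = [p\lambda_1(o)], \quad \lambda_2[po] = [p^2\lambda_2(o)], \quad \lambda_r[po] = 0$$

$$\lambda_1[qo] = [q\lambda_1(o)], \quad \lambda_2[qo] = [q^2\lambda_2(o)], \quad \lambda_r[qo] = 0 \qquad (55)$$

for $r > 2$. For a general function of these, $F = a[po] + b[qo]$, the correct computation of the
law of errors by means of $F = [(ap + bq)o]$ will further give

$$\lambda_1(F) = (ap_1 + bq_1)\lambda_1(o_1) + \ldots + (ap_n + bq_n)\lambda_1(o_n) = a\lambda_1[po] + b\lambda_1[qo]$$

$$\lambda_2(F) = (ap_1 + bq_1)^2\lambda_2(o_1) + \ldots + (ap_n + bq_n)^2\lambda_2(o_n) = a^2\lambda_2[po] + b^2\lambda_2[qo] + 2ab[pq\lambda_2(o)] \qquad (56)$$

$$\lambda_r(F) = 0 \text{ for } r > 2.$$

It appears then, both that the mean values can be computed unconditionally, as
if $[po]$ and $[qo]$ were unbound observations, and that the law of errors remains typical.
Only in the square of the mean error there is a difference, as the term containing the
factor $2ab$ in $\lambda_2(F)$ ought not to be found in the formula, if $[po]$ and $[qo]$ were not
bound to one another.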

When consequently

$$[pq\lambda_2(o)] = 0 \qquad (57)$$

the functions $[po]$ and $[qo]$ can indeed be treated in all respects like unbound observations,
for the law of errors for every linear function of them is found correctly determined also
upon this supposition. We call such functions mutually "free functions", and for such,
consequently, the formula for the mean error

$$\lambda_2([po]a + [qo]b) = a^2\lambda_2[po] + b^2\lambda_2[qo] \qquad (58)$$

holds good.

If this formula holds good for one set of finite values of $a$ and $b$, it holds good
for all.

If two functions are mutually free, each of them is said to be "free of the other",
and inversely.

Example 1. The sum and difference of two equally good, unbound observations
are mutually free.

Example 2. When the co-ordinates of a point are observed with equal accuracy
and without any bonds, any transformed rectangular co-ordinates for the same will be
mutually free.

Example 3. The sum or the mean value of equally good, unbound observations
is free of every difference between two of these, and generally also free of every (linear)
function of such differences.

Example 4. The differences between one observation and two other arbitrary, unbound
observations cannot be mutually free.

Example 5. Linear functions of unbound observations, which are all different, are
always free.

Example 6. Functions with a constant proportion cannot be mutually free.

§ 37. In accordance with what we have now seen of free functions, corresponding 
propositions must hold good also of observations which are influenced by the same circum¬ 
stances: it is not necessary to respect all connecting bonds; it is possible that actually 
bound observations may be regarded as free. The conditions on which this may be the 
case, must be sought, as in (57), by means of the mean errors caused by each circumstance 
and the coefficients by which the circumstance influences the several observations. Note
particularly:

If two observations are supposed to be connected by one single circumstance which 
they have in common, such a bond must not be left out of consideration, but is to be 
respected: Likewise, if there are several bonds, each of which influences both observations 
in the same direction. 

If, on the other hand, some common circumstances influence the observations in the 
same direction, others in opposite directions, and if, moreover, one class must be supposed 
to work as forcibly as the other, the observations may possibly be free, and the danger of 
treating them as unbound is at any rate less than in the other cases. 

§ 38. Assuming that the functions of which we shall speak in the following are
linear, or at any rate may be regarded as linear when expanded by Taylor's formula,
because the errors are so small that we may reject squares and products of the deviations
of the observations from fixed values; and assuming that the observations $o_1 \ldots o_n$, on
which all the functions depend, are unbound, and that the values of $\lambda_2(o_1) \ldots \lambda_2(o_n)$ are
given, we can now demonstrate a series of important propositions.

Out of the total system of all functions

$$[po] = p_1 o_1 + \ldots + p_n o_n$$

of the given $n$ observations we can arbitrarily select partial systems of functions, each
partial system containing all those, which can be represented as functions of a number of
$m < n$ mutually independent functions, representative of the system,

$$[ao] = a_1 o_1 + \ldots + a_n o_n$$
$$\ldots$$
$$[do] = d_1 o_1 + \ldots + d_n o_n,$$

of which no one can be expressed as a function of the others. We can then demonstrate
the existence of other functions which are free of every function belonging to the partial
system represented by $[ao] \ldots [do]$. It is sufficient to prove that such a function
$[go] = g_1 o_1 + \ldots + g_n o_n$ is free of $[ao] \ldots [do]$ in consequence of the equations
$[ga\lambda_2] = \ldots = [gd\lambda_2] = 0$. For if so, $[go]$ must be free of every function of the partial
system, because

$$[(xa + \ldots + zd)o] = x[ao] + \ldots + z[do]$$

$$[g(xa + \ldots + zd)\lambda_2] = x[ga\lambda_2] + \ldots + z[gd\lambda_2] = 0.$$


Any function of the total system $[po]$ can now in one single way be resolved
into a sum of two functions of the same observations, one of which is free of the partial
system represented by $[ao] \ldots [do]$, while the other belongs to this system.

If we call the free addendum $[p'o]$, this proposition may be written

$$[po] = [p'o] + \{x[ao] + \ldots + z[do]\}. \qquad (59)$$

By means of the conditions of freedom, $[p'a\lambda_2] = \ldots = [p'd\lambda_2] = 0$, all that
concerns the unknown function $[p'o]$ can be eliminated. We find

$$[pa\lambda_2] = x[aa\lambda_2] + \ldots + z[da\lambda_2]$$
$$\ldots \qquad (60)$$
$$[pd\lambda_2] = x[ad\lambda_2] + \ldots + z[dd\lambda_2]$$

from which we determine the coefficients $x \ldots z$ unambiguously. The number $m$ of these
equations is equal to the number of the unknown quantities, and they must be sufficient
for the determination of the latter, because, according to a well known proposition from
the theory of determinants, the determinant of the coefficients

$$\begin{vmatrix} [aa\lambda_2] & \ldots & [da\lambda_2] \\ \ldots & & \ldots \\ [ad\lambda_2] & \ldots & [dd\lambda_2] \end{vmatrix}$$

is positive, being a sum of squares, and cannot be 0, unless at least one of the functions
$[ao] \ldots [do]$ could, contrary to our supposition, be represented as a function of
the others.

From the values of $x \ldots z$ thus found, we find likewise

$$[p'o] = [po] - x[ao] - \ldots - z[do]. \qquad (61)$$

If $[po]$ belongs to the partial system represented by $[ao] \ldots [do]$, the determination
of $x \ldots z$ expresses its coefficients in that system only, and then we get
identically $[p'o] = 0$.

But if we take $[po]$ out of the partial system, then (61) gives us $[p'o]$ as different
from zero and free of that partial system. If the difference $[qo] - [po]$ belongs to the partial
system of $[ao] \ldots [do]$, $[qo]$ must produce in this manner the very same free function as $[po]$.


Let $[po] \ldots [ro]$ be $n - m$ functions, independent of one another and of the $m$
functions $[ao] \ldots [do]$; if we then find $[p'o]$ out of $[po]$ and $[r'o]$ out of $[ro]$ as the free
functional parts in respect to $[ao] \ldots [do]$, the $n$ functions $[ao] \ldots [do]$ and $[p'o] \ldots [r'o]$
may be the representative functions of the total system of the functions of $o_1 \ldots o_n$, because
no relation $\alpha[p'o] + \ldots + \delta[r'o] = 0$ is possible; for by (61) it might result in a relation
$\alpha[po] + \ldots + \delta[ro] + \kappa[ao] + \ldots + \varphi[do] = 0$ in contradiction to the presumed
representative character of $[po] \ldots [ro]$ and $[ao] \ldots [do]$.

If we employ $[p'o] \ldots [r'o]$ or other $n - m$ mutually independent functions
$[go] \ldots [ko]$, all free of the partial set $[ao] \ldots [do]$, as representative functions of another
partial system of $o_1 \ldots o_n$, then every function of this system must be free of every function
of the partial system $[ao] \ldots [do]$ (compare the introduction to this §). No other function
of $o_1 \ldots o_n$ can be free of $[ao] \ldots [do]$ than those belonging to the system $[go] \ldots [ko]$;
otherwise we should have more than $n$ independent functions of the $n$ variables $o_1 \ldots o_n$.


Thus selecting arbitrarily a partial system of functions of the observations $o_1 \ldots o_n$
we can, with reference to given squares of mean errors $\lambda_2(o_1) \ldots \lambda_2(o_n)$, distribute
the linear homogeneous functions of these observations into three divisions:

1) the given partial system $[ao] \ldots [do]$,

2) the partial system of functions $[go] \ldots [ko]$, which are free of the former, and

3) all the rest, of which it is proved that every such function is always in only one way
compounded by addition of one function of the first partial system to one of the second.


The freedom of functions is a reciprocal property. If the second partial system
$[go] \ldots [ko]$ were selected arbitrarily instead of the first $[ao] \ldots [do]$, then only this latter
would be found as the free functions in 2); the composition of every function in 3) would
remain the same.

Example, Determine the parts of Oj + o*, Oj + Oj + and o, + o,, which 
are free of «« u^d O 2 -I-O 4 , on the supposition that all 4 observations are equally 
exact and unbound. 


Answer: J (o, -|- e,Oj, 0 etc. 





§ 39. Like all other functions of the observations $o_1 \ldots o_n$, each of these observed
values, for instance $o_i$, is the sum of two quantities, one $o_i''$ belonging to the system of
$[ao] \ldots [do]$, the other $o_i'$ to the partial system of $[go] \ldots [ko]$ which is free of this. But
from $o_i = o_i' + o_i''$ follows, generally, that $[qo] = [qo'] + [qo'']$, and $[qo'']$ evidently belongs
to the system of $[ao] \ldots [do]$, $[qo']$ to the system which is free of this. Accordingly there must
between the $n$ functions $o_1' \ldots o_n'$ exist $m$ relations $[ao'] = \ldots = [do'] = 0$; likewise $n - m$
relations $[go''] = \ldots = [ko''] = 0$ between $o_1'' \ldots o_n''$.

§ 40. That the functions of observations can be split up, in an analogous way,
into three or more free quantities, is of no consequence for the following, except when we
imagine this operation to be carried through to the utmost. It is easy enough to see,
however, that also the partial systems of functions can be split up. We could, for instance,
among the representatives $[ao] \ldots [do]$ of one partial system select a smaller number
$[ao] \ldots [bo]$, and from the others $[co] \ldots [do]$, according to (37), separate the functions
$[c'o] \ldots [d'o]$ which were free of $[ao] \ldots [bo]$. $[c'o] \ldots [d'o]$ would then represent the
sub-system of functions, free of $[ao] \ldots [bo]$, within the partial system $[ao] \ldots [do]$; and in
this way we may continue till all representative functions are mutually free, every single
one of all the rest. Such a collection of representative functions we call a complete set
of free functions. Their number is sufficient to enable us to express all the observations,
and all functions of these observations, as functions of them; and their mutual freedom
has the effect that they can be treated, by all computations of laws of errors, quite like
unbound observations, and thus wholly replace the original observations.

§ 41. The mathematical theory of the transformations of observations into free functions
is analogous to the theory of the transformation of rectangular co-ordinates (comp.
§ 36, example 2), and is treated in several text-books of the higher algebra and determinants
under the name of the theory of the orthogonal substitutions. I shall here enter into
those propositions only, which we are to use in what follows.

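When we have transformed the unbound observations $o_1 \ldots o_n$ into the complete
set of free functions $[ao], [b'o] \ldots [d^{\nu}o]$, it is often important to be able to undertake the
opposite transformation back to the observations. This is very easily done, for we have

$$o_i = \lambda_2(o_i)\left\{\frac{a_i}{[aa\lambda_2]}[ao] + \frac{b_i'}{[b'b'\lambda_2]}[b'o] + \ldots + \frac{d_i^{\nu}}{[d^{\nu}d^{\nu}\lambda_2]}[d^{\nu}o]\right\} \qquad (62)$$

which is demonstrated by substitution in the equations for the direct transformation

$$[ao] = a_1 o_1 + \ldots + a_n o_n$$
$$\ldots$$
$$[d^{\nu}o] = d_1^{\nu} o_1 + \ldots + d_n^{\nu} o_n,$$

because $[ab'\lambda_2] = \ldots = [b'd^{\nu}\lambda_2] = 0$.

As the original observations, considered as functions of the transformed observations
$[ao] \ldots [d^{\nu}o]$, must be mutually free, just as well as the latter are free functions of the
former, we find by computing the squares of the mean errors $\lambda_2(o_i)$ and the equation that
expresses the formal condition that $o_i$ is free of $o_k$, two of the most remarkable properties
of the orthogonal substitutions:

$$\frac{1}{\lambda_2(o_i)} = \frac{a_i^2}{[aa\lambda_2]} + \frac{b_i'^2}{[b'b'\lambda_2]} + \ldots + \frac{(d_i^{\nu})^2}{[d^{\nu}d^{\nu}\lambda_2]} \qquad (63)$$

$$0 = \frac{a_i a_k}{[aa\lambda_2]} + \frac{b_i' b_k'}{[b'b'\lambda_2]} + \ldots + \frac{d_i^{\nu} d_k^{\nu}}{[d^{\nu}d^{\nu}\lambda_2]} \qquad (64)$$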

If all observations and functions are stated with their respective mean error as
unity, or are divided by their mean error, a reduction which gives also a more elegant
form to all the preceding equations, the sum of the squares of the thus reduced observations
is not changed by any (orthogonal) transformation into a complete set of free functions.

We have

$$\frac{[ao]^2}{\lambda_2[ao]} + \frac{[b'o]^2}{\lambda_2[b'o]} + \ldots + \frac{[d^{\nu}o]^2}{\lambda_2[d^{\nu}o]} = \frac{o_1^2}{\lambda_2(o_1)} + \ldots + \frac{o_n^2}{\lambda_2(o_n)} \qquad (65)$$


which, pursuant to the equations (63) and (64), is easily demonstrated by working out the
sums of the squares in the numerators on the left side of the equation. As this equation
is identical, the same proposition holds good also, for instance, of the differences between
$o_1 \ldots o_n$ and $n$ arbitrarily selected variables $v_1 \ldots v_n$ corresponding to them and of the
corresponding differences between the values of the functions. Also here is

$$\frac{([ao] - [av])^2}{[aa\lambda_2]} + \ldots + \frac{([d^{\nu}o] - [d^{\nu}v])^2}{[d^{\nu}d^{\nu}\lambda_2]} = \frac{(o_1 - v_1)^2}{\lambda_2(o_1)} + \ldots + \frac{(o_n - v_n)^2}{\lambda_2(o_n)} \qquad (66)$$


§ 42. For the practical computation of a complete set of free functions it will be
the easiest way to bring forward the functions of such a set one by one. In this case we
must select a sufficient number of functions and fix the order in which these are to be
taken into consideration. For a moment we can imagine this order to be arbitrary.

The function $[ao]$, which is the first in this list, is now, unchanged, taken into the
transformed set. By multiplying the selected function by suitable constants of the form
$[ba\lambda] : [aa\lambda]$ and subtracting the products from the remaining functions $[bo]$ in the list we can,
according to § 38, from each of these separate the addendum which is free of the selected
function. Of these then the one which is founded on function Nr. 2 on the list is taken
into the transformed set. This function is multiplied in the same way and subtracted from
the still remaining functions, so that they give up the addenda which are free of both the
selected functions, and so on. The following schedule shows the course of the operation
for the case $n = 4$.


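$$
\begin{array}{l|l|l|l}
\text{Functions} & \text{Coefficients} & \text{Sums of the Products} & \text{Rule of Computation} \\ \hline
[ao] & a_1\ a_2\ a_3\ a_4 & [aa\lambda]\ [ab\lambda]\ [ac\lambda]\ [ad\lambda] & [ao] \text{ is selected.} \\
[bo] & b_1\ b_2\ b_3\ b_4 & [ba\lambda]\ [bb\lambda]\ [bc\lambda]\ [bd\lambda] & [bo] - [ao]\cdot[ba\lambda] : [aa\lambda] = [b'o] \\
[co] & c_1\ c_2\ c_3\ c_4 & [ca\lambda]\ [cb\lambda]\ [cc\lambda]\ [cd\lambda] & [co] - [ao]\cdot[ca\lambda] : [aa\lambda] = [c'o] \\
[do] & d_1\ d_2\ d_3\ d_4 & [da\lambda]\ [db\lambda]\ [dc\lambda]\ [dd\lambda] & [do] - [ao]\cdot[da\lambda] : [aa\lambda] = [d'o] \\ \hline
[b'o] & b_1'\ b_2'\ b_3'\ b_4' & [b'b'\lambda]\ [b'c'\lambda]\ [b'd'\lambda] & [b'o] \text{ is selected, is free of } [ao] \\
[c'o] & c_1'\ c_2'\ c_3'\ c_4' & [c'b'\lambda]\ [c'c'\lambda]\ [c'd'\lambda] & [c'o] - [b'o]\cdot[c'b'\lambda] : [b'b'\lambda] = [c''o] \\
[d'o] & d_1'\ d_2'\ d_3'\ d_4' & [d'b'\lambda]\ [d'c'\lambda]\ [d'd'\lambda] & [d'o] - [b'o]\cdot[d'b'\lambda] : [b'b'\lambda] = [d''o] \\ \hline
[c''o] & c_1''\ c_2''\ c_3''\ c_4'' & [c''c''\lambda]\ [c''d''\lambda] & [c''o] \text{ is selected, is free of } [b'o] \text{ and } [ao] \\
[d''o] & d_1''\ d_2''\ d_3''\ d_4'' & [d''c''\lambda]\ [d''d''\lambda] & [d''o] - [c''o]\cdot[d''c''\lambda] : [c''c''\lambda] = [d'''o] \\ \hline
[d'''o] & d_1'''\ d_2'''\ d_3'''\ d_4''' & [d'''d'''\lambda] & [d'''o] \text{ is free of } [c''o],\ [b'o], \text{ and } [ao]. \\
\end{array}
$$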
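
The course of the schedule can be sketched in a few lines of code. This is an illustrative sketch, not the author's own formulation: each function is given by its list of coefficients, `lam` holds the squares of the mean errors $\lambda_2(o_i)$, the functions are assumed to be mutually independent, and the example coefficients are hypothetical.

```python
from fractions import Fraction


def sum_of_products(p, q, lam):
    """The sum [p q lambda] over all observations."""
    return sum(Fraction(pi) * qi * li for pi, qi, li in zip(p, q, lam))


def complete_set_of_free_functions(functions, lam):
    """Successively select a function and free all remaining ones of it.

    Assumes the given functions are independent, so that no selected
    function has a vanishing sum of squares.
    """
    remaining = [[Fraction(c) for c in f] for f in functions]
    free = []
    while remaining:
        selected = remaining.pop(0)          # next function on the list
        free.append(selected)
        square = sum_of_products(selected, selected, lam)
        # Subtract selected * [f s lambda] : [s s lambda] from every remaining f.
        remaining = [
            [fi - si * sum_of_products(f, selected, lam) / square
             for fi, si in zip(f, selected)]
            for f in remaining
        ]
    return free


# Hypothetical example with n = 4 equally good observations.
functions = [[1, 1, 1, 1], [1, 2, 3, 4], [1, 4, 9, 16], [1, 8, 27, 64]]
lam = [1, 1, 1, 1]
free = complete_set_of_free_functions(functions, lam)
for coefficients in free:
    print([str(c) for c in coefficients])
```

The first function is taken over unchanged, and every pair of resulting functions has a vanishing sum of products, so that the set can replace the original observations in all computations of mean errors.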


The computations of the sums of the products (in which for the sake of brevity
we have written $\lambda$ for $\lambda_2(o)$) could be made all through by means of the single
coefficients in the transformed functions, as it must be done in the beginning by means
of the coefficients in the original functions. It is much easier, however (particularly if
for some reason or other we might otherwise do without the computation of the
coefficients of the transformed functions), to make use, for this purpose, of the following
remarkable property of these sums of the products. We have, for instance,


$$[b'c'\lambda] = [bc\lambda] - [ac\lambda]\cdot[ba\lambda]:[aa\lambda]. \tag{67}$$


Consequently, the same general rule of computation as, according to the schedule, holds 
good of the functions and their coefficients, holds good also of the sums of the products 
and of the squares. The schedule gets the following appendix: 





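$$
\begin{array}{lll}
[cb\lambda]-[ab\lambda]\cdot[ca\lambda]:[aa\lambda] = [c'b'\lambda], & [cc\lambda]-[ac\lambda]\cdot[ca\lambda]:[aa\lambda] = [c'c'\lambda], & [cd\lambda]-[ad\lambda]\cdot[ca\lambda]:[aa\lambda] = [c'd'\lambda] \\
[db\lambda]-[ab\lambda]\cdot[da\lambda]:[aa\lambda] = [d'b'\lambda], & [dc\lambda]-[ac\lambda]\cdot[da\lambda]:[aa\lambda] = [d'c'\lambda], & [dd\lambda]-[ad\lambda]\cdot[da\lambda]:[aa\lambda] = [d'd'\lambda]
\end{array}
$$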


As will be seen, there is a check by means of double computation for each of
the sums of the products properly so called. The sums of the squares are of special
importance as they are the squares of the mean errors of the transformed functions,
$\lambda_2[ao] = [aa\lambda]$, $\lambda_2[b'o] = [b'b'\lambda]$, $\lambda_2[c''o] = [c''c''\lambda]$, and $\lambda_2[d'''o] = [d'''d'''\lambda]$.
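
The schedule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `weighted_product` and `free_functions` and the small numerical case in the usage note are hypothetical, exact fractions are used to avoid rounding, and the given functions are assumed to be linearly independent.

```python
from fractions import Fraction


def weighted_product(p, q, lam):
    """Sum of the products [pq lambda] = sum_i p_i * q_i * lambda_2(o_i)."""
    return sum(pi * qi * li for pi, qi, li in zip(p, q, lam))


def free_functions(rows, lam):
    """Transform linear functions (rows of coefficients) into mutually free ones.

    The first remaining row is selected unchanged; it is then multiplied by
    [r s lambda] : [s s lambda] and subtracted from every row r still remaining,
    exactly as in the rule of computation of the schedule.
    """
    remaining = [[Fraction(c) for c in r] for r in rows]
    selected = []
    while remaining:
        s = remaining.pop(0)
        ss = weighted_product(s, s, lam)
        if ss == 0:
            raise ValueError("functions are not linearly independent")
        selected.append(s)
        remaining = [
            [ri - weighted_product(r, s, lam) / ss * si for ri, si in zip(r, s)]
            for r in remaining
        ]
    return selected
```

For instance, with the hypothetical values `lam = [1, 2, 1]` and rows `[1, 1, 1]`, `[1, 0, 0]`, `[0, 1, 0]`, every pair of returned rows has a vanishing weighted product, and the square of the mean error of the second free function agrees with $[bb\lambda] - [ab\lambda]^2:[aa\lambda]$.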

Example. Five equally good, unbound observations $o_1$, $o_2$, $o_3$, $o_4$, and $o_5$ represent
values of a table with equidistant arguments. The function tabulated is known to be an
integral algebraic one, not exceeding the 3rd degree. The transformation into free functions
is to be carried out in such a way that the higher differences are selected before the lower
ones. (Because $\Delta^4$, certainly, $\Delta^3$ etc., possibly, represent equations of condition.) With symbols
for the differences, and with $\lambda_2(o_i) = 1$, we have then:


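In each block the columns are: the function, its coefficients of $o_1 \dots o_5$, the sums of the
products with the functions of the block (in the order of the rows), and the factor with which
the selected function is added.

$$
\begin{array}{l|rrrrr|rrrrr|l}
\text{Function} & o_1 & o_2 & o_3 & o_4 & o_5 & \text{Sums} & \text{of the} & \text{Products} & & & \text{Factors} \\
\hline
o_3 & 0 & 0 & 1 & 0 & 0 & 1 & -1 & -2 & 3 & 6 & -\tfrac{3}{35} \\
\nabla\Delta o_3 & 0 & 0 & -1 & 1 & 0 & -1 & 2 & 3 & -6 & -10 & \tfrac{1}{7} \\
\Delta^2 o_3 & 0 & 1 & -2 & 1 & 0 & -2 & 3 & 6 & -10 & -20 & \tfrac{2}{7} \\
\nabla\Delta^3 o_3 & 0 & -1 & 3 & -3 & 1 & 3 & -6 & -10 & 20 & 35 & -\tfrac{1}{2} \\
\Delta^4 o_3 & 1 & -4 & 6 & -4 & 1 & 6 & -10 & -20 & 35 & 70 & \text{is selected} \\
\hline
o_3 - \tfrac{3}{35}\Delta^4 o_3 & -\tfrac{3}{35} & \tfrac{12}{35} & \tfrac{17}{35} & \tfrac{12}{35} & -\tfrac{3}{35} & \tfrac{17}{35} & -\tfrac{1}{7} & -\tfrac{2}{7} & 0 & & 0 \\
\nabla\Delta o_3 + \tfrac{1}{7}\Delta^4 o_3 & \tfrac{1}{7} & -\tfrac{4}{7} & -\tfrac{1}{7} & \tfrac{3}{7} & \tfrac{1}{7} & -\tfrac{1}{7} & \tfrac{4}{7} & \tfrac{1}{7} & -1 & & \tfrac{2}{5} \\
\Delta^2 o_3 + \tfrac{2}{7}\Delta^4 o_3 & \tfrac{2}{7} & -\tfrac{1}{7} & -\tfrac{2}{7} & -\tfrac{1}{7} & \tfrac{2}{7} & -\tfrac{2}{7} & \tfrac{1}{7} & \tfrac{2}{7} & 0 & & 0 \\
\nabla\Delta^3 o_3 - \tfrac{1}{2}\Delta^4 o_3 & -\tfrac{1}{2} & 1 & 0 & -1 & \tfrac{1}{2} & 0 & -1 & 0 & \tfrac{5}{2} & & \text{is selected} \\
\hline
o_3 - \tfrac{3}{35}\Delta^4 o_3 & -\tfrac{3}{35} & \tfrac{12}{35} & \tfrac{17}{35} & \tfrac{12}{35} & -\tfrac{3}{35} & \tfrac{17}{35} & -\tfrac{1}{7} & -\tfrac{2}{7} & & & 1 \\
\nabla\Delta o_3 + \tfrac{2}{5}\nabla\Delta^3 o_3 - \tfrac{2}{35}\Delta^4 o_3 & -\tfrac{2}{35} & -\tfrac{6}{35} & -\tfrac{1}{7} & \tfrac{1}{35} & \tfrac{12}{35} & -\tfrac{1}{7} & \tfrac{6}{35} & \tfrac{1}{7} & & & -\tfrac{1}{2} \\
\Delta^2 o_3 + \tfrac{2}{7}\Delta^4 o_3 & \tfrac{2}{7} & -\tfrac{1}{7} & -\tfrac{2}{7} & -\tfrac{1}{7} & \tfrac{2}{7} & -\tfrac{2}{7} & \tfrac{1}{7} & \tfrac{2}{7} & & & \text{is selected} \\
\hline
o_3 + \Delta^2 o_3 + \tfrac{1}{5}\Delta^4 o_3 & \tfrac{1}{5} & \tfrac{1}{5} & \tfrac{1}{5} & \tfrac{1}{5} & \tfrac{1}{5} & \tfrac{1}{5} & 0 & & & & \text{are free,} \\
\nabla\Delta o_3 - \tfrac{1}{2}\Delta^2 o_3 + \tfrac{2}{5}\nabla\Delta^3 o_3 - \tfrac{1}{5}\Delta^4 o_3 & -\tfrac{1}{5} & -\tfrac{1}{10} & 0 & \tfrac{1}{10} & \tfrac{1}{5} & 0 & \tfrac{1}{10} & & & & \text{are both selected.} \\
\end{array}
$$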





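The complete set of free observations and the squares of their mean errors
are thus:

$$
\begin{array}{lll}
(0) = o_3 + \Delta^2 o_3 + \tfrac{1}{5}\Delta^4 o_3 & = \tfrac{1}{5}(o_1 + o_2 + o_3 + o_4 + o_5), & \lambda_2(0) = \tfrac{1}{5} \\
(1) = \nabla\Delta o_3 - \tfrac{1}{2}\Delta^2 o_3 + \tfrac{2}{5}\nabla\Delta^3 o_3 - \tfrac{1}{5}\Delta^4 o_3 & = \tfrac{1}{10}(-2o_1 - o_2 + o_4 + 2o_5), & \lambda_2(1) = \tfrac{1}{10} \\
(2) = \Delta^2 o_3 + \tfrac{2}{7}\Delta^4 o_3 & = \tfrac{1}{7}(2o_1 - o_2 - 2o_3 - o_4 + 2o_5), & \lambda_2(2) = \tfrac{2}{7} \\
(3) = \nabla\Delta^3 o_3 - \tfrac{1}{2}\Delta^4 o_3 & = \tfrac{1}{2}(-o_1 + 2o_2 - 2o_4 + o_5), & \lambda_2(3) = \tfrac{5}{2} \\
(4) = \Delta^4 o_3 & = o_1 - 4o_2 + 6o_3 - 4o_4 + o_5, & \lambda_2(4) = 70
\end{array}
$$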
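
The figures of this example can be re-derived mechanically. The following is a minimal sketch, assuming the differences have the coefficients listed in the code (the fourth and second differences centred on $o_3$, the first and third differences taken as $o_4 - o_3$ and $-o_2 + 3o_3 - 3o_4 + o_5$); the helper names `dot` and `make_free` are hypothetical and not part of the text.

```python
from fractions import Fraction as F


def dot(p, q):
    """Sum of the products for unit mean errors, lambda_2(o_i) = 1."""
    return sum(x * y for x, y in zip(p, q))


def make_free(ordered_rows):
    """Free each row of all previously selected rows, in the given order."""
    free = []
    for r in ordered_rows:
        r = [F(x) for x in r]
        for s in free:
            k = dot(r, s) / dot(s, s)
            r = [x - k * y for x, y in zip(r, s)]
        free.append(r)
    return free


# Higher differences are selected before the lower ones.
rows = [
    [1, -4, 6, -4, 1],   # fourth difference
    [0, -1, 3, -3, 1],   # third difference
    [0, 1, -2, 1, 0],    # second difference
    [0, 0, -1, 1, 0],    # first difference
    [0, 0, 1, 0, 0],     # o_3 itself
]
free = make_free(rows)
squares = [dot(f, f) for f in free]  # squares of the mean errors of (4), (3), (2), (1), (0)
```

The list `squares` reproduces the values 70, 5/2, 2/7, 1/10, and 1/5, and the last row of `free` is the arithmetic mean of the five observations.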

Through this and the preceding chapter we have got a basis which will generally 
be sufficient for computations with observations and, in a wider sense, for computations 
with numerical values which are not given in exact form, but only by their laws of errors. 

We can, in the first place, compute the law of errors for a given, linear function of
reciprocally free observations whose laws of presumptive errors we know. By this we can
solve all problems in which there is not given a greater number of observations, and other
more or less exact data, than of the reciprocally independent unknown values of the
problem. When we, in such cases, by the means of the exact mathematics, have expressed
each of the unknown numbers as a function of the given observations, and when we have
succeeded in bringing these functions into a linear form, then we can, by (35), compute
the laws of errors for each of the unknown numbers.

Such a solution of a problem may be looked upon as a transformation, by which
$n$ observed or in other ways given values are transformed into $n$ functions, each
corresponding to its particular value among the independent, unknown values of the problem.
It is often natural thus to look upon the solution of a problem as a transformation, when
the solution of the problem is not the end but only the means of determining other
unknown quantities, perhaps many others, which are all explicit functions of the independent
unknowns of the problem. Thus, for instance, we compute the 6 elements of the orbit of
a planet by the rectascensions and declinations corresponding to 3 times, not precisely as
our end, but in order thereby to be able to compute ephemerides of the future places of
the planet. But while the validity of this view is absolute in exact mathematics, it
is only limited when we want to determine the presumptive laws of errors of sought
functions by the given laws of errors for the observations. Only the mean values, sought
as well as given, can be treated just as exact quantities, and with these the general linear
transformation of $n$ given into $n$ sought numbers, with altogether $n^2$ arbitrary constants,
remains valid, as also the employment of the found mean numbers as independent variables
in the mean value of the explicit functions.

If we want also correctly to determine the mean errors, we may employ no other
transformation than that into free functions. And if, to some extent, we may choose the
independent unknowns of the problem as we please, we may often succeed in carrying
through the treatment of a problem by transformation into free functions; for an unknown
number may be chosen quite arbitrarily in all its $n$ coefficients, and each of the following
unknowns loses, as a function of the observations, only an arbitrary coefficient in
comparison to the preceding one; even the $n$-th unknown can still get an arbitrary factor.
Altogether $\tfrac{1}{2}n(n+1)$ of the $n^2$ coefficients of these transformations are arbitrary.

But if the problem does not admit of any solution through a transformation into
free functions, the mean errors for the several unknowns, no matter how many there
may be, can be computed only in such a way that each of the sought numbers is directly
expressed as a linear function of the observations. The same holds good also when the
laws of errors of the observations are not typical, and we are to examine how it is with
$\lambda_3$ and the higher half-invariants in the laws of errors of the sought functions.

Still greater importance, nay a privileged position, as the only legitimate proceeding, 
gets the transformation into a complete set of free functions in the over-determined problems, 
which are rejected as self-contradictory in exact mathematics. When we have a collection 
of observations whose number is greater than the number of the independent unknowns
of the problem, then the question will be to determine laws of actual errors from the 
standpoint of the observations. We must mediate between the observations that contradict 
one another, in order to determine their mean numbers, and the discrepancies themselves 
must be employed to determine their mean deviations, etc. But as we have not to do with 
repetitions, the discrepancies conceal themselves behind the changes of the circumstances 
and require transformations for their detection. All the functions of the observations
which, as the problem is over-determined, have theoretically necessary values, as, for 
instance, the sum of the angles of a plane triangle, must be selected for special use. 
Besides, those of the unknowns of the problem, to the determination of which the theory 
does not contribute, must come forth by the transformation by which the problem is to 
be solved. 

As we shall see in the following chapters on Adjustment, it becomes of essential 
moment here that we transform into a system of free functions. The transformation begins 
with mutually free observations, and must not itself introduce any bond, because the trans¬ 
formed functions in various ways must come forth as observations which determine laws 
of actual errors. 


X. ADJUSTMENT.

§ 43. Pursuing the plan indicated in §5 we now proceed to treat the determination
of laws of errors in some of the cases of observations made under varying or different
essential circumstances. But here we must be content with very small results. The
general problem will hardly ever be solved. The necessary equations must be taken from
the totality of the hypotheses or theories which express all the terms of each law of error
(say their half-invariants) as functions of the varying or wholly different circumstances
of the observations. Without great regret, however, the multiplicity of these theoretical
equations can be reduced considerably, if we suppose all the laws of errors to be exclusively
of the typical form.

For each observation we need then only two theoretical equations, one representing
its presumptive mean value $\lambda_1(o_i)$, the other the square of its mean error $\lambda_2(o_i)$, as
functions of the essential circumstances. But the theoretical equations will generally contain
other unknown quantities, the arbitrary constants of the theory, and these must be
eliminated or determined together with the laws of errors. The complexity is still great enough
to require a further reduction.

We must, preliminarily at all events, suppose the mean errors to be given directly
by theory, or at least their mutual ratios, the weights. If not, the problems require a
solution by the indirect proceeding. Hypothetical assumptions concerning the $\lambda_2(o_i)$ are
used in the first approximation and checked and corrected by special operations which, as
far as possible, we shall try to expose beside the several solutions, using for brevity the
word "criticism" for these and other operations connected with them.

But even if we confine our theoretical equations to the presumptive means $\lambda_1(o_i)$
and the arbitrary unknown quantities of the theory, the solutions will only be possible if
we further suppose the theoretical equations to be linear or reducible to this form.
Moreover, it will generally be necessary to regard as exactly given many quantities really
found by observation, on the supposition only that the corresponding mean errors will be
small enough to render such irregularity inoffensive.

In the solution of such problems we must rely on the found propositions about
functions of observations with exactly given coefficients. In the theoretical equations of
each problem sets of such functions will present themselves, some functions appearing as
given, others as required. The observations, as independent variables of these functions,
are now the given observed values $o_i$, now the presumptive means $\lambda_1(o_i)$; the latter are,
for instance, among the unknown quantities required for the exact satisfaction of the
theoretical equations.

What is said here provisionally about the problems that will be treated in the
following, can be illustrated by the simplest case (discussed above) of $n$ repetitions of the
same observation, resulting in the observed values $o_1, \dots o_n$. If we here write the
theoretical equations without introducing any unnecessary unknown quantities, they will show
the forms $0 = \lambda_1(o_i) - \lambda_1(o_k)$ or, generally, $0 = \lambda_1[a(o_i - o_k)]$. But these equations are
evidently not sufficient for the determination of any $\lambda_1(o_i)$, which they only give if another
$\lambda_1(o_k)$ is found beforehand. The sought common mean cannot be formed by the
introduction of the observed values into any function $[a(o_i - o_k)]$, these erroneous values of the
functions being useful only to check $\lambda_2(o_i)$ by our criticism. But we must remember what
we know about free functions: that the whole system of these functions $[a(o_i - o_k)]$ is only
a partial system, with $n-1$ differences $o_i - o_k$ as representatives. The only functions
which can be free of this partial system, must evidently be proportional to the sum
$o_1 + \dots + o_n$, and by this we find the sought determination by

$$\lambda_1(o_i) = \frac{1}{n}(o_1 + \dots + o_n),$$

the presumptive mean being equal to the actual mean of the observed values.

If we thus consider a general series of unbound observations, $o_1, \dots o_n$, it is of
the greatest importance to notice first that two sorts of special cases may occur, in which
our problem may be solved immediately. It may be that the theoretical equations
concerning the observations leave some of the observations, for instance $o_1$, quite untouched; it
may be also that the theory fully determines certain others of the observations, for
instance $o_n$.

In the former case, that is when none of all the theories in any way concern the
observation $o_1$, it is evident that the observed value $o_1$ must be approved unconditionally.
Even though this observation does not represent any mean value found by repetitions, but
stands quite isolated, it must be accepted as the mean $\lambda_1(o_1)$ in its law of presumptive
errors, and the corresponding square of the mean error $\lambda_2(o_1)$ must then be taken,
unchanged, from the assumed investigations of the method of observation.

If, in the latter case, $o_n$ is an observation which directly concerns a quantity that
can be determined theoretically (for instance the sum of the angles of a rectilinear triangle),
then it is, as such, quite superfluous as long as the theory is maintained, and then it
must in all further computations be replaced by the theoretically given value; and in
the same way $\lambda_2(o_n)$ must be replaced by zero, as the square of the mean error on the
errorless theoretical value.

The only possible meaning of such superfluous observations must be to test the
correctness of the theory for approbation or rejection (a third result is impossible when
we are dealing with any real theory or hypothesis), or to be used in the criticism.

In such a test it must be assumed that the theoretical value corresponding to $o_n$,
which we will call $u_n$, is identical with the mean value in the law of presumptive errors
for $o_n$, consequently, that $u_n = \lambda_1(o_n)$, and the condition of an affirmative result must be
obtained from the square of the deviation, $(o_n - u_n)^2$, in comparison with $\lambda_2(o_n)$. The
equation $(o_n - u_n)^2 = \lambda_2(o_n)$ need not be exactly satisfied, but the approximation must
at any rate be so close that we may expect to find $\lambda_2(o_n)$ coming out as the mean of
numerous observed values of $(o_n - u_n)^2$. Compare § 34.

§ 44. If then all the observations $o_1 \dots o_n$ fall under one or the other of these two
cases, the matter is simple enough. But generally the observations $o_i$ will be connected
by theoretical equations of condition which, separately, are insufficient for the determination
of the single ones. Then the question is whether we can transform the series of observations
in such a way that a clear separation between the two opposite relations to the theory
can be made, so that some of the transformed functions of the observations, which must
be mutually free in order to be treated as unbound observations, become quite independent
of the theory, while the rest are entirely dependent on it. This can be done, and the
computation with observations in consequence of these principles is what we mean by
the word "adjustment".

For as every theory can be fully expressed by a certain number, $n-m$, of theoretical
equations which give the exact values of the same number of mutually independent linear
functions, and as we are able, as we have seen, from every observation or linear function
of the observations, in one single way, to separate a function which is free and independent
of these just named theoretically given functions, and which must thus enter into another
system, represented by $m$ functions, this system must include all those functions of the
observations which are independent of the theory and cannot be determined by it. Each of the
thus mutually separated systems can be imagined to be represented, the theoretical system by
$n-m$, the non-theoretical or empirical system by $m$ mutually free functions, which together
represent all observations and all linear functions of the same, and which may be looked
upon as a complete, transformed system of free functions, consequently as unbound
observations. The two systems can be separated in a single way only, although the
representation of each partial system, by free functions, can occur in many ways.

It is the idea of the adjustment, by means of this transformation, to give the
theory its due and the observations theirs, in such a way that every function of the
theoretical system, and particularly the $n-m$ free representatives of the same, are exchanged,
each with its theoretically given value, which, pursuant to the theory, is free of error. On
the other hand, every function of the empiric system and, particularly, its $m$ free
representatives remain unchanged as the observations determine them. Every general function of
the $n$ observations $[do]$ and, particularly, the observations themselves are during the
adjustment split into two univocally determined addenda: the theoretical function $[d'o]$, which
should have a fixed value $D'$, and the non-theoretical one $[d''o]$. The former $[d'o]$ is by
the adjustment changed into $D'$ and made errorless, the latter is not changed at all. The
result of the adjustment, $D' + [d''o]$, is called the adjusted value of the function, and may
be indicated as $[du]$, the adjusted values of the observations themselves being written
$u_1 \dots u_n$. The forms of the functions are not broken, as the distributive principle
$f(x+y) = f(x) + f(y)$ holds good of every homogeneous linear function.

The determination of the adjusted values is analogous to the formation of the
mean values of laws of errors by repetitions. For theoretically determined functions the
adjusted value is the mean value on the very law of presumptive errors; for the functions
that are free of the whole theory, we have the extreme opposite limiting case, mean values
represented by an isolated, single observation. In general the adjusted values $[du]$ are
analogous to actual mean values by a more or less numerous series of repetitions. For while
$\lambda_2[do] = \lambda_2[d'o] + \lambda_2[d''o]$, we have $\lambda_2[du] = \lambda_2[d''o]$, consequently
smaller than $\lambda_2[do]$. The ratio $\lambda_2[do] : \lambda_2[du]$ is analogous to the number of the repetitions or
the weight of the mean value.

§ 45. By criticism we mean the trial of the (hypothetical or theoretical)
suppositions, which have been made in the adjustment, with respect to the mean errors of
the observations; new determinations of the mean errors, analogous to the determinations
by the square of the mean deviations, will eventually also fall under this. The basis of
the criticism must be taken from a comparison of the observed and the adjusted values,
for instance the differences $[do] - [du]$. According to the principle of §34 we must expect
the square of such a difference, on an average, to agree with the square of the
corresponding mean error, $\lambda_2([do] - [du])$, but as $[do] - [du] = [d'o] - D'$, and $\lambda_2[d'o] =
\lambda_2[do] - \lambda_2[du]$, we get

$$\lambda_2([do] - [du]) = \lambda_2[do] - \lambda_2[du], \tag{68}$$

which, by way of parenthesis, shows that the observed and the adjusted values of the same
function or observation cannot in general be mutually free. We ought then to have

$$\frac{([do] - [du])^2}{\lambda_2[do] - \lambda_2[du]} = 1$$

on the average; and for a sum of terms of this form we must expect the mean to approach
the number of the terms, nota bene, if there are no bonds between the functions $[do] - [du]$;
but in general such bonds will be present, produced by the adjustment or by the
selection of the functions.


It is no help if we select the original and unbound observations themselves, and
consequently form sums such as

$$\left[\frac{(o-u)^2}{\lambda_2(o) - \lambda_2(u)}\right],$$

for after the adjustment and its change of the mean errors, $u_1 \dots u_n$ are not generally
free functions such as $o_1 \dots o_n$. Only one single choice is immediately safe, viz., to stick
to the system of the mutually free functions which, in the adjustment, have themselves
represented the observations: the $n-m$ theoretically given functions and the $m$ which
the adjustment determines by the observations. Only of these we know that they are free
both before and after the adjustment. And as the differences of the last-mentioned $m$
functions identically vanish, the criticism must be based upon the $n-m$ terms corresponding
to the theoretically free functions $[ao] = A, \dots [b'o] = B'$ of the series

$$\frac{([ao] - A)^2}{\lambda_2[ao]} + \dots + \frac{([b'o] - B')^2}{\lambda_2[b'o]}, \tag{70}$$

the sum of which must be expected to be $= n-m$.

Of course we must not expect this equation to be strictly satisfied; according to
the second equation (46) the square of the mean error on 1, as the expected value of each
term of the series, ought to be put down $= 2$; for the whole series, consequently, we can
put down the expected value as $n-m \pm \sqrt{2(n-m)}$.
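
The figure 2 can be checked by a short computation. This is only a sketch, assuming (as this chapter does) that the laws of errors are typical, so that each free term behaves like the square of a standardized normal deviation; the symbols $z$ and $k$ are introduced here for illustration only.

$$
\begin{aligned}
E(z^2) &= 1, \qquad E(z^4) = 3, \\
\lambda_2(z^2) &= E(z^4) - \big(E(z^2)\big)^2 = 3 - 1 = 2, \\
\lambda_2(z_1^2 + \dots + z_k^2) &= 2k \quad \text{for } k = n-m \text{ mutually free terms},
\end{aligned}
$$

so that the sum has the expected value $k$ with the mean error $\sqrt{2k}$.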

But now we can make use of the proposition (66) concerning the free functions.
It offers us the advantage that we can base the criticism on the deviations of the several
observations from their adjusted values, the latter, we know, being such a special set of
values as may be compared to the observations like $v_1 \dots v_n$ loc. cit.; $u_1 \dots u_n$ are only
distinguished from $v_1 \dots v_n$ by giving the functions which are free of the theory the same
values as the observations. We have consequently

$$\frac{([ao] - A)^2}{\lambda_2[ao]} + \dots + \frac{([b'o] - B')^2}{\lambda_2[b'o]} = \left[\frac{(o-u)^2}{\lambda_2(o)}\right] = n-m \pm \sqrt{2(n-m)}. \tag{71}$$


If we compare the sum on the right side in this expression with the above
mentioned $\left[\frac{(o-u)^2}{\lambda_2(o) - \lambda_2(u)}\right]$, which we dare not approve on account of the bonds produced by
the adjustment, then there is no decided contradiction between putting down
$\left[\frac{(o-u)^2}{\lambda_2(o)}\right]$ at the smaller value $n-m$ only, while $\left[\frac{(o-u)^2}{\lambda_2(o) - \lambda_2(u)}\right]$, by the diminution of the
denominators, can get the value $n$; only we can get no certainty for it.

The ratios between the corresponding terms in these two sums of squares,
consequently

$$\frac{\lambda_2(o) - \lambda_2(u)}{\lambda_2(o)} = 1 - \frac{\lambda_2(u)}{\lambda_2(o)},$$

we call "scales", viz. scales for measuring the influence
of the adjustment on the single observation. More generally we call

$$1 - \frac{\lambda_2[du]}{\lambda_2[do]} = \text{scale for the function } [do]. \tag{72}$$

If the scale for a function or observation has its greatest possible value, viz. 1,
then $\lambda_2(u) = 0$. The theory has then entirely decided the result of the adjustment. But if
the scale sinks to its lowest limit $= 0$, we get just the reverse, $\lambda_2(u) = \lambda_2(o)$, i. e. the
theory has had no influence at all; the whole determination is based on the accidental
value of the observation, and for observations in this case we get

$$\frac{(o-u)^2}{\lambda_2(o) - \lambda_2(u)} = \frac{0}{0}.$$

Even though the scale has a finite, but very small value it will be inadmissible to
depend on the value of such a term becoming $= 1$. We understand now, therefore, the
superiority of the sum of the squares $\left[\frac{(o-u)^2}{\lambda_2(o)}\right] = n-m$ to the sum of the squares
$\left[\frac{(o-u)^2}{\lambda_2(o) - \lambda_2(u)}\right] = n$ as a bearer of the summary criticism.


We may also very well, on principle, sharpen the demand for adjustment on the
part of the criticism, so that not only the whole sum of the squares $\left[\frac{(o-u)^2}{\lambda_2(o)}\right]$ must
approach the value $n-m$, but also partial sums, extracted from the same, or even its
several terms, must approach certain values. Only, they are not to be added up as
numbers of units, but must be sums of the scales of the corresponding terms. So much
we may trust to the sum of the squares $\left[\frac{(o-u)^2}{\lambda_2(o) - \lambda_2(u)}\right] = n$, that this principle, when
judiciously applied, may be considered as fully justified.


The sum of the squares $\left[\frac{(o-u)^2}{\lambda_2(o)}\right]$ possesses an interesting property which all
other authors have used as the basis of the adjustment, under the name of the "method of
the least squares". The above sum of the squares gets by the adjustment the least possible
value that $\left[\frac{(o-v)^2}{\lambda_2(o)}\right]$ can get for values $v_1 \dots v_n$ which satisfy the conditions of the theory.
The proposition (66) concerning the free functions shows that the condition of this minimum
is that $[c''o] = [c''v], \dots [d'''o] = [d'''v]$ for all the free functions which are determined by
the observations, consequently just by putting for each $v$ the corresponding adjusted
value $u$.
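
The minimum property can be illustrated numerically. The sketch below assumes a single theoretical equation $[av] = A$ and invented observations; the function names `adjust_one_condition` and `weighted_sq` and all the numbers are hypothetical. It moves the adjusted values by a displacement that still satisfies the theory and shows that the weighted sum of squares grows by exactly the weighted square of that displacement.

```python
from fractions import Fraction as F


def adjust_one_condition(o, lam, a, A):
    """Adjusted values u for one condition [au] = A.

    u_i = o_i - lambda_2(o_i) * a_i * K, with K = ([ao] - A) : [aa lambda].
    """
    aal = sum(ai * ai * li for ai, li in zip(a, lam))
    K = (sum(ai * oi for ai, oi in zip(a, o)) - A) / aal
    return [oi - li * ai * K for oi, li, ai in zip(o, lam, a)]


def weighted_sq(o, v, lam):
    """Sum of the squares [(o - v)^2 : lambda_2(o)]."""
    return sum((oi - vi) ** 2 / li for oi, vi, li in zip(o, v, lam))


# Hypothetical data: theory demands v1 + v2 - v3 = 0.
o = [F(10), F(20), F(31)]
lam = [F(1), F(2), F(1)]
a = [1, 1, -1]
u = adjust_one_condition(o, lam, a, F(0))

# Any other admissible set v = u + delta must have [a delta] = 0.
delta = [F(1), F(0), F(1)]
v = [ui + di for ui, di in zip(u, delta)]

q_u = weighted_sq(o, u, lam)
q_v = weighted_sq(o, v, lam)
excess = sum(di * di / li for di, li in zip(delta, lam))
```

Here `q_v - q_u` equals `excess`, which is never negative, so no admissible set of values gives a smaller sum than the adjusted values do; `q_u` itself equals $([ao]-A)^2 : [aa\lambda]$, the single term of the criticism.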


§ 46. The carrying out of adjustments depends of course to a high degree on the
form in which the theory is given. The theoretical equations will generally include some
observations and, beside these, some unknown quantities, elements, in smaller number than
those of the equations, which we just want to determine through the adjustment. This
general form, however, is unpractical, and may also easily be transformed through the
usual mathematical processes of elimination. We always go back to one or the other of
two extreme forms which it is easy to handle: either we assume that all the elements
are eliminated, so that the theory is given as above assumed by $n-m$ linear equations
of condition with theoretically given coefficients and values, adjustment by correlates; or
we manage to get an equation for each observation, consequently no equations of condition
between several observations. This is easily attained by making the number of the elements
as large ($= m$) as may be necessary: we may for instance give some values of observations
the name of elements. This sort of adjustment is called adjustment by elements. We
shall discuss these two forms in the following chapters XI and XII, first the adjustment by
correlates whose rules it is easiest to deduce. In practice we prefer adjustment by correlates
when $m$ is nearly as large as $n$, adjustment by elements when $m$ is small.


XI. ADJUSTMENT BY CORRELATES.


§ 47. We suppose we have ascertained that the whole theory is expressed in the
equations $[au] = A, \dots [cu] = C$, where the adjusted values $u$ of the $n$ observations
are the only unknown quantities; we prefer in doubtful cases to have too many equations
rather than too few, and occasionally a supernumerary equation to check the computation.
The first thing the adjustment by correlates then requires is that the functions $[ao] \dots [co]$,
corresponding to these equations, are made free of one another by the schedule in § 42.

Let $[ao], \dots [c''o]$ indicate the $n-m$ mutually free functions which we have got
by this operation, and let us, beside these, imagine the system of free functions completed
by $m$ other arbitrarily selected functions, $[d'''o], \dots$, representatives of the empiric
functions; the adjustment is then principally made by introducing the theoretical values
into this system of free functions. It is finally accomplished by transforming back from
the free modified functions to the adjusted observations. For this inverse transformation,
according to (62), the $n$ equations are:





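$$o_i = \lambda_2(o_i)\left\{ a_i\frac{[ao]}{[aa\lambda]} + \dots + c''_i\frac{[c''o]}{[c''c''\lambda]} + d'''_i\frac{[d'''o]}{[d'''d'''\lambda]} + \dots \right\}$$

and according to (35) (compare also (63))

$$\lambda_2(o_i) = \lambda_2(o_i)^2\left\{ \frac{a_i^2}{[aa\lambda]} + \dots + \frac{c''^2_i}{[c''c''\lambda]} + \frac{d'''^2_i}{[d'''d'''\lambda]} + \dots \right\}. \tag{74}$$

As the adjustment influences only the $n-m$ first terms of each of these equations,
we have, because $[au] = A, \dots [c''u] = C''$, and $[d'''u] = [d'''o], \dots$,

$$u_i = \lambda_2(o_i)\left\{ a_i\frac{A}{[aa\lambda]} + \dots + c''_i\frac{C''}{[c''c''\lambda]} + d'''_i\frac{[d'''o]}{[d'''d'''\lambda]} + \dots \right\} \tag{75}$$

and

$$\lambda_2(u_i) = \lambda_2(o_i)^2\left\{ \frac{d'''^2_i}{[d'''d'''\lambda]} + \dots \right\}, \tag{76}$$

and

$$o_i - u_i = \lambda_2(o_i)\left\{ a_i\frac{[ao] - A}{[aa\lambda]} + \dots + c''_i\frac{[c''o] - C''}{[c''c''\lambda]} \right\}, \tag{77}$$

$$\lambda_2(o_i) - \lambda_2(u_i) = \lambda_2(o_i - u_i) = \lambda_2(o_i)^2\left\{ \frac{a_i^2}{[aa\lambda]} + \dots + \frac{c''^2_i}{[c''c''\lambda]} \right\}. \tag{78}$$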

Thus for the computation of all the differences between the observed and adjusted values
of the several observations and the squares of their mean errors, and thereby indirectly for
the whole adjustment, we need but use the values and the mean errors of the several
observations, the coefficients in the theoretically given functions, and the two values of
each of these, namely, the theoretical value, and the value which the observations would
give them.

The factors in the expression for $o_i - u_i$,

$$K_a = \frac{[ao] - A}{[aa\lambda]}, \quad \dots \quad K_{c''} = \frac{[c''o] - C''}{[c''c''\lambda]},$$

which are common to all the observations, are called correlates, and have given the method
its name. The adjusted, improved values of the observations are computed in the easiest
way by the formula

$$u_i = o_i - \lambda_2(o_i)\{a_i K_a + \dots + c''_i K_{c''}\}. \tag{79}$$


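By writing the equation (78)

$$1 - \frac{\lambda_2(u_i)}{\lambda_2(o_i)} = \lambda_2(o_i)\left\{ \frac{a_i^2}{[aa\lambda]} + \dots + \frac{c''^2_i}{[c''c''\lambda]} \right\} \tag{80}$$

and summing up for all values of $i$ from 1 to $n$ we demonstrate the proposition concerning
the sum of the scales discussed in the preceding chapter, viz.

$$\left[1 - \frac{\lambda_2(u)}{\lambda_2(o)}\right] = \frac{[aa\lambda]}{[aa\lambda]} + \dots + \frac{[c''c''\lambda]}{[c''c''\lambda]} = n-m. \tag{81}$$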
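
Formulas (78), (79), and (81) can be tried on a small case. The following is a sketch only: the function name `adjust_by_correlates`, the helper `wdot`, and the data are hypothetical, the theoretical equations are assumed linearly independent, and exact fractions are used so that the checks hold without rounding.

```python
from fractions import Fraction as F


def wdot(p, q, lam):
    """Sum of the products [pq lambda]."""
    return sum(x * y * l for x, y, l in zip(p, q, lam))


def adjust_by_correlates(o, lam, conditions):
    """Adjust observations o (squared mean errors lam) to the given conditions.

    conditions is a list of (coefficients, theoretical value) pairs.
    Returns the adjusted values u, their squared mean errors, and the scales.
    """
    # Make the theoretical functions mutually free, carrying their values along.
    free = []
    for coeffs, value in conditions:
        s = [F(c) for c in coeffs]
        S = F(value)
        for t, T in free:
            k = wdot(s, t, lam) / wdot(t, t, lam)
            s = [x - k * y for x, y in zip(s, t)]
            S = S - k * T
        free.append((s, S))
    u = list(o)
    var_u = list(lam)
    for s, S in free:
        ssl = wdot(s, s, lam)
        # Correlate of this free function, computed from the observations.
        K = (sum(si * oi for si, oi in zip(s, o)) - S) / ssl
        u = [ui - li * si * K for ui, li, si in zip(u, lam, s)]
        var_u = [vi - li * li * si * si / ssl for vi, li, si in zip(var_u, lam, s)]
    scales = [1 - vi / li for vi, li in zip(var_u, lam)]
    return u, var_u, scales


# Hypothetical data: four observations, two theoretical equations.
o = [F(10), F(20), F(31), F(5)]
lam = [F(1), F(2), F(1), F(1)]
conditions = [([1, 1, -1, 0], 0), ([0, 0, 1, -1], 25)]
u, var_u, scales = adjust_by_correlates(o, lam, conditions)
```

With these inputs the adjusted values satisfy both theoretical equations exactly, and the scales add up to 2, the number of the equations of condition.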


§ 48. It deserves to be noticed that all these equations are homogeneous with
respect to the symbol $\lambda_2$. Therefore it makes no change at all in the results of the
adjustment or the computation of the scales, if our assumed knowledge of the mean errors
in the several observations has failed by a wrong estimate of the unity of the mean errors,
if only the proportionality is preserved; we can adjust correctly if we know only the
relative weights of the observations. The homogeneousness is not broken till we reach the
equations of the criticism:

$$\frac{([ao] - A)^2}{[aa\lambda]} + \dots + \frac{([c''o] - C'')^2}{[c''c''\lambda]} = \left[\frac{(o-u)^2}{\lambda_2(o)}\right] = \left[(aK_a + \dots + c''K_{c''})^2\lambda_2(o)\right] = n-m \pm \sqrt{2(n-m)}.$$


It follows that criticism in this form, the "summary criticism", can only be used to try
the correctness of the hypothetical unity of the mean errors, or to determine this if it
has originally been quite unknown. The special criticism, on the other hand, can, where
the series of observations is divided into groups, give fuller information through the sums
of squares $\left[\frac{(o-u)^2}{\lambda_2(o)}\right]$ taken for each group. We may, for instance, test or determine the
unities of the mean errors for one group by means of observations of angles, for another by
measurements of distances, etc.

The criticism has also other means at its disposal. Thus the differences $(o - u)$
ought to be small, particularly those whose mean errors have been small, and they ought
to change their signs in such a way that approximately


= 0 (84) 

for natural or accidentally selected groups, especially for such series of observations as are 
nearly repetitions, the essential circumstances having varied very little. 

If, ultimately, the observations can be arranged systematically, either according to
essential circumstances or to such as are considered inessential, we must expect frequent
and irregular changes of the signs of $o - u$. If not, we are to suspect the observations of
systematical errors, the theory proving to be insufficient.

§ 49. It will not be superfluous to present in the form of a schedule of the
adjustment by correlates what has been said here, also as to the working out of the free
functions. We suppose then that, among 4 unbound observations $o_1$, $o_2$, $o_3$, and $o_4$, with
the squares on their mean errors $\lambda_2(o_1)$, $\lambda_2(o_2)$, $\lambda_2(o_3)$, and $\lambda_2(o_4)$, there exist relations
which can be expressed by the three theoretical equations

$$
\begin{aligned}
[au] &= a_1u_1 + a_2u_2 + a_3u_3 + a_4u_4 = A \\
[bu] &= b_1u_1 + b_2u_2 + b_3u_3 + b_4u_4 = B \\
[cu] &= c_1u_1 + c_2u_2 + c_3u_3 + c_4u_4 = C.
\end{aligned}
$$

The schedule is then as follows:


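$$
\begin{array}{ll|lll|ll|l|llll|l|l}
\text{The given} & & & & & \text{Free functions} & & & \text{Adjustment} & & & & \text{Scales} & \text{Criticism} \\
\hline
 & & A & B & C & B' & C' & C'' & & & & & & \\
o_1 & \lambda_2(o_1) & a_1 & b_1 & c_1 & b'_1 & c'_1 & c''_1 & o_1-u_1 & u_1 & \lambda_2(o_1-u_1) & \lambda_2(u_1) & 1-\lambda_2(u_1):\lambda_2(o_1) & (o_1-u_1)^2:\lambda_2(o_1) \\
o_2 & \lambda_2(o_2) & a_2 & b_2 & c_2 & b'_2 & c'_2 & c''_2 & o_2-u_2 & u_2 & \lambda_2(o_2-u_2) & \lambda_2(u_2) & 1-\lambda_2(u_2):\lambda_2(o_2) & (o_2-u_2)^2:\lambda_2(o_2) \\
o_3 & \lambda_2(o_3) & a_3 & b_3 & c_3 & b'_3 & c'_3 & c''_3 & o_3-u_3 & u_3 & \lambda_2(o_3-u_3) & \lambda_2(u_3) & 1-\lambda_2(u_3):\lambda_2(o_3) & (o_3-u_3)^2:\lambda_2(o_3) \\
o_4 & \lambda_2(o_4) & a_4 & b_4 & c_4 & b'_4 & c'_4 & c''_4 & o_4-u_4 & u_4 & \lambda_2(o_4-u_4) & \lambda_2(u_4) & 1-\lambda_2(u_4):\lambda_2(o_4) & (o_4-u_4)^2:\lambda_2(o_4) \\
\hline
 & & [ao] & [bo] & [co] & [b'o] & [c'o] & [c''o] & & & & & \text{Sum} = 3 \text{ as proof} & \text{Sum for proof} \\
 & & [aa\lambda] & [ab\lambda] & [ac\lambda] & & & & & & & & & \\
 & & [ba\lambda] & [bb\lambda] & [bc\lambda] & [b'b'\lambda] & [b'c'\lambda] & & & & & & & \\
 & & [ca\lambda] & [cb\lambda] & [cc\lambda] & [c'b'\lambda] & [c'c'\lambda] & [c''c''\lambda] & & & & & & \\
\end{array}
$$

$$\beta = \frac{[ba\lambda]}{[aa\lambda]}, \quad \gamma = \frac{[ca\lambda]}{[aa\lambda]}, \quad \gamma' = \frac{[c'b'\lambda]}{[b'b'\lambda]}; \qquad
K_a = \frac{[ao] - A}{[aa\lambda]}, \quad K_{b'} = \frac{[b'o] - B'}{[b'b'\lambda]}, \quad K_{c''} = \frac{[c''o] - C''}{[c''c''\lambda]}.$$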

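The free functions are computed by means of:

$$
\begin{array}{lll}
B' = B - \beta A & C' = C - \gamma A & C'' = C' - \gamma' B' \\
b'_i = b_i - \beta a_i & c'_i = c_i - \gamma a_i & c''_i = c'_i - \gamma' b'_i \\
[b'o] = [bo] - \beta[ao] & [c'o] = [co] - \gamma[ao] & [c''o] = [c'o] - \gamma'[b'o] \\
[b'b'\lambda] = [bb\lambda] - \beta[ab\lambda] & [c'b'\lambda] = [cb\lambda] - \gamma[ab\lambda] & \\
 & [c'c'\lambda] = [cc\lambda] - \gamma[ac\lambda] & [c''c''\lambda] = [c'c'\lambda] - \gamma'[b'c'\lambda]
\end{array}
$$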


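By the adjustment properly so called we compute

$$
\begin{aligned}
o_i - u_i &= \lambda_2(o_i)\,(a_i K_a + b'_i K_{b'} + c''_i K_{c''}), \\
\lambda_2(o_i - u_i) &= \lambda_2(o_i)^2\left\{ \frac{a_i^2}{[aa\lambda]} + \frac{b'^2_i}{[b'b'\lambda]} + \frac{c''^2_i}{[c''c''\lambda]} \right\}, \\
\lambda_2(u_i) &= \lambda_2(o_i) - \lambda_2(o_i - u_i).
\end{aligned}
$$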


In order to get a check we ought further to compute $[au] = A$, $[bu] = B$, and
$[cu] = C$, with the values we have found for $u_1$, $u_2$, $u_3$, and $u_4$. Moreover it is useful to
add a superfluous theoretical equation, for instance $[(a+b+c)u] = A+B+C$, through the
computation of the free functions, which is correct only if such a superfluity leads to
identical results.

§ 50. It is a deficiency in the adjustment by correlates that it cannot well be
employed as an intermediate link in a computation that goes beyond it. The method is
good as far as the determination of the adjusted values of the several observations and
the criticism on the same, but no farther. We are often in want of the adjusted values
with determinations of the mean errors of certain functions of the observations; in order
to solve such problems the adjustment by correlates must be made in a modified form.
The simplest course is, I think, immediately after drawing up the theoretical equations of
condition to annex the whole series of the functions that are to be examined, for instance
$[do]$, ... $[eo]$, and include them in the computation of the free functions. In doing so we
must take care not to mix up the theoretically and the empirically determined functions,
so that the order of the operation must unconditionally give the precedence to the
theoretical functions; the others are not made free till the treatment of these is quite
finished. The functions $[d'''o]$, ... $[e'''o]$, which are separated from these, remain
unchanged by the adjustment both in value and in mean error; it is scarcely
necessary to mention it. And at last the adjusted functions $[du]$, ... $[eu]$, by retrograde
transformation, are determined as linear functions of $A$, $B'$, $C''$.

Example 1. In a plane triangle each angle has been measured several times, all
measurements being made according to the same method, without bonds and with the same
(unknown) mean error:

for angle A has been found 70° 0' 5" as the mean of  6 measurements
for angle B has been found 50° 0' 3" as the mean of 10 measurements
for angle C has been found 60° 0' 2" as the mean of 15 measurements

The adjusted values for the angles are then 70°, 50°, and 60°, the mean error for a single
measurement $= \sqrt{300}'' = 17''.3$, the scales 0.5, 0.3, and 0.2.
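
The figures of this example can be checked by a short computation. The following is only a sketch, assuming that the square of the mean error of each mean is inversely proportional to its number of measurements and that the single condition is the angle sum of 180°; the function name `adjust_triangle` is ours, not the author's.

```python
from fractions import Fraction

def adjust_triangle(excess_seconds, counts):
    """Adjust three angle means under the single condition A + B + C = 180 degrees.

    excess_seconds: observed seconds above the round values 70, 50, 60 degrees.
    counts: number of single measurements behind each mean.
    """
    # lambda_2 of each mean, in units of lambda_2 of a single measurement
    rel_var = [Fraction(1, n) for n in counts]
    misclosure = sum(excess_seconds)            # observed sum minus 180 degrees
    total = sum(rel_var)
    # the correction o - u of each angle is proportional to its lambda_2
    corrections = [v * misclosure / total for v in rel_var]
    adjusted = [o - c for o, c in zip(excess_seconds, corrections)]
    # with one condition, the weighted sum of squares estimates lambda_2 (single measurement)
    unit_var = sum(c * c / v for c, v in zip(corrections, rel_var))
    # scale = 1 - lambda_2(u) / lambda_2(o)
    scales = [v / total for v in rel_var]
    return adjusted, unit_var, scales

adjusted, unit_var, scales = adjust_triangle([5, 3, 2], [6, 10, 15])
```

Here `unit_var` is the square of the mean error of a single measurement in square seconds, so that its square root is to be compared with the 17".3 of the text.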

Example 2. (Comp, example § 42.) Five equidistant tabular values, 12, Id, 29, 
41, 55, have been obtained by taking approximate round values from an exact table, from 
which reason their mean errors are all >= )/X. The adjustment is performed under the 
successive hypotheses that the table belongs to a function of the 2^, and !■* degree, 
and the hypothesis of the second degree is varied by the special hypothesis that the 2*^ 
difference is exactly 2, in the following schedule marked (or). The same eehedule may 
be used foi all four modiheations of the problem, so that in the sums to the right in the 
schedule, the first term corresponds to the first modification only, and the sum of tb« 
two first terms to the second modification: 




0 


j* 

rj» 

i* 

{Vff 


0- 

-M 




0 

0 

0 (or2) 

0 

0 (or2) 





12 

A 

1 

0 

0 

-i 



1+ 7+180 (or+20)), 

Ho( 1+ 7+20) 

19 

i 

ts 

-4 

-1 

1 

1 - 


'hi~ 

4-U- 

80 (or-10)), 

a-.(16+28+ 5) 

29 

I 

T5 

6 

3 

-2 

0 - 

-f 

u 

6 + 0— 

160 (or-20)), 

jf,(38+ 0+20) 

41 

i 

11 

-.4 

-3 

1 

-1 - 


h(~ 

4+14- 

80 (or-10)), 

^(16+28+ 5) 

55 

i 

It 

1 

1 

0 

i 

1 

Jii 

1- 7+160 (or+20)), 

i},{ 1+ 7+20) 



1 

0 

2 

-i 

18 

T 







?o 

s& 

so 









n 

It 

""is 









It 

to 

It 

10 
-It 

n 

0 







to 

10 

8 

0 

1 







""it 

""it 

IS 


«s 











For the summary criticism: 



^JC. 

TO 



.=8(orl) 


f(o-«)>l 

i’Wo) 1 

- 

35^35+ 

7680 (or 120) 

35 


The hypothesis of the third degree, $\Delta^4 = 0$, where the values of $70u_i$ and their
differences are:

    839    1334    2024    2874    3849
        495     690     850     975
            195     160     125
                -35     -35,

agrees too well with the observations, and must be suspected of being under-adjusted, for
the sum of the squares of the summary criticism is only $6/35$, where we might expect $1 \pm \sqrt{2}$.

The hypothesis of the second degree, $\Delta^4 = 0$, $\Delta^3 = 0$, gives for $70u_i$ and
differences:

    832    1348    2024    2860    3856
        516     676     836     996
            160     160     160.

The adjustment is here good, the sum of the squares is $48/35$, and we might expect $2 \pm \sqrt{4}$.

The hypothesis of the first degree, $\Delta^4 = 0$, $\Delta^3 = 0$, $\Delta^2 = 0$, gives for the adjusted
values and their differences:

    9.6    20.4    31.2    42.0    52.8
       10.8    10.8    10.8    10.8.

The deviations are evidently too large ($o-u$ is +2.4, -1.4, -2.2, -1.0, +2.2)
to be due to the use of round numbers; the sum of the squares is also
220.8 instead of $3 \pm \sqrt{6}$,
consequently, no doubt, an over-adjustment.

The special adjustment of the second degree, $\Delta^4 = 0$, $\Delta^3 = 0$, and $\Delta^2 = 2$, gives
for $u_i$ and its differences:

    11.6    19.4    29.2    41.0    54.8
        7.8     9.8    11.8    13.8

The deviations $o-u$ = +0.4, -0.4, -0.2, 0.0, +0.2 nowhere reach ½ and may consequently
be due to the use of round numbers; the sum of the squares,

    4.8 instead of $3 \pm \sqrt{6}$,

also agrees very well. Indeed, a constant subtraction of 0.04 from $u_i$ would lead to
$(3.4)^2$, $(4.4)^2$, $(5.4)^2$, $(6.4)^2$, and $(7.4)^2$, from which the example is taken.
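
The four adjustments of this example can be reproduced by fitting polynomials by least squares, which for equidistant, equally exact values is equivalent to imposing the vanishing of the higher differences. The following is a sketch under that assumption and is not the schedule of the text; the helper names `solve`, `fit_polynomial` and `criticism` are ours, and the abscissae are taken as -2, ..., 2.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve a small linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def fit_polynomial(values, degree, fixed=None):
    """Equal-weight least-squares polynomial through equidistant values.

    `fixed` optionally maps a power of the abscissa to a prescribed coefficient,
    e.g. {2: 1} prescribes a second difference of exactly 2.
    """
    fixed = fixed or {}
    xs = [Fraction(i) - Fraction(len(values) - 1, 2) for i in range(len(values))]
    free = [k for k in range(degree + 1) if k not in fixed]
    reduced = [v - sum(c * x ** k for k, c in fixed.items()) for x, v in zip(xs, values)]
    normal = [[sum(x ** (j + k) for x in xs) for k in free] for j in free]
    rhs = [sum(x ** j * v for x, v in zip(xs, reduced)) for j in free]
    coef = dict(zip(free, solve(normal, rhs)))
    coef.update(fixed)
    return [sum(c * x ** k for k, c in coef.items()) for x in xs]

def criticism(values, adjusted):
    """Sum of (o - u)^2 / lambda_2(o) with lambda_2(o) = 1/12."""
    return 12 * sum((o - u) ** 2 for o, u in zip(values, adjusted))

TABLE = [12, 19, 29, 41, 55]
third = fit_polynomial(TABLE, 3)
second = fit_polynomial(TABLE, 2)
first = fit_polynomial(TABLE, 1)
special = fit_polynomial(TABLE, 2, fixed={2: 1})
```

Exact fractions are used so that the sums of squares can be compared directly with the values 6/35, 48/35, 220.8, and 4.8.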

Example 3. Between 4 points on a straight line the 6 distances

    $o_{12}$, $o_{13}$, $o_{14}$
    $o_{23}$, $o_{24}$
    $o_{34}$

are measured with equal exactness without bonds. By adjustment we find for instance

\[ u_{12} = \tfrac12 o_{12} + \tfrac14 (o_{13} - o_{23}) + \tfrac14 (o_{14} - o_{24}); \]

we notice that every scale $= \tfrac12$. It is recommended actually to work the example by a
millimeter scale, which is displaced after the measurement of each distance in order to
avoid bonds.


XII. ADJUSTMENT BY ELEMENTS.

§ 51. Though every problem in adjustment may be solved in both ways, by
correlates as well as by elements, the difficulty in so doing is often very different. The
most frequent cases, where the number of equations of condition is large, are best suited
for adjustment by elements, and this is therefore employed far oftener than adjustment
by correlates.

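The adjustment by elements requires the theory in such a form that each observation
is represented by one equation which expresses the mean value $\lambda_1(o)$ explicitly as
linear functions of unknown values, the "elements", $x, y, \ldots z$:

\[ \lambda_1(o_1) = p_1 x + q_1 y + \cdots + r_1 z \]
\[ \cdots \]
\[ \lambda_1(o_n) = p_n x + q_n y + \cdots + r_n z \tag{85} \]

where the $p, q, \ldots r$ are theoretically given. All observations are supposed to be
unbound.

The problem is then first to determine the adjusted values of these elements
$x, y, \ldots z$, after which each of these equations (85), which we call "equations for the
observations", gives the adjusted value $u$ of the observation.

Constantly assuming that $\lambda_2(o)$ is known for each observation, we can from the
system (85) deduce the following normal equations:

\[ \left[\frac{p\,\lambda_1(o)}{\lambda_2(o)}\right] = \left[\frac{pp}{\lambda_2}\right]x + \left[\frac{pq}{\lambda_2}\right]y + \cdots + \left[\frac{pr}{\lambda_2}\right]z = \left[\frac{po}{\lambda_2}\right] \]
\[ \left[\frac{q\,\lambda_1(o)}{\lambda_2(o)}\right] = \left[\frac{qp}{\lambda_2}\right]x + \left[\frac{qq}{\lambda_2}\right]y + \cdots + \left[\frac{qr}{\lambda_2}\right]z = \left[\frac{qo}{\lambda_2}\right] \]
\[ \cdots \]
\[ \left[\frac{r\,\lambda_1(o)}{\lambda_2(o)}\right] = \left[\frac{rp}{\lambda_2}\right]x + \left[\frac{rq}{\lambda_2}\right]y + \cdots + \left[\frac{rr}{\lambda_2}\right]z = \left[\frac{ro}{\lambda_2}\right] \tag{86} \]

the rule of formation being apparent from the left hand terms. Of these normal equations
we can prove, first that they, $m$ in number, are suited for the determination of the $m$
elements, so far as these, on the whole, can be determined by the equations (85), and
then that the functions of the observations, which form their left hand terms, are free of
all the theoretical conditions of the problem, so that, as indicated by the last sign of
equality in the normal equations, they can and must be determined by the directly
observed values $o_1 \ldots o_n$.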

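For if we assume, as to the first proposition, that any of the normal equations
can be deduced from the others, so that all the elements cannot be determined by these
equations, then there must be $m$ coefficients $h, k, \ldots l$, so that

\[ h\left[\frac{pp}{\lambda_2}\right] + k\left[\frac{qp}{\lambda_2}\right] + \cdots + l\left[\frac{rp}{\lambda_2}\right] = 0 \]
\[ h\left[\frac{pq}{\lambda_2}\right] + k\left[\frac{qq}{\lambda_2}\right] + \cdots + l\left[\frac{rq}{\lambda_2}\right] = 0 \]
\[ \cdots \]
\[ h\left[\frac{pr}{\lambda_2}\right] + k\left[\frac{qr}{\lambda_2}\right] + \cdots + l\left[\frac{rr}{\lambda_2}\right] = 0 \]

($\lambda_2$ being everywhere used for $\lambda_2(o)$); but if we multiply these again respectively by $h, k, \ldots l$ and
add, we get

\[ \left[\frac{(hp + kq + \cdots + lr)^2}{\lambda_2}\right] = 0, \]

that is

\[ h p_i + k q_i + \cdots + l r_i = 0, \]

so that not only the normal equations, but the very equations for the observations can,
consequently, all be written with $m-1$ or a smaller number of elements.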

But further, the system of functions represented by the normal equations is free
of every one of the conditions of the theory. The latter we can get by eliminating the
elements $x, y, \ldots z$ from the equations of the observations (85). But elimination of an
element, say for instance $x$, leads to the functions $p_k \lambda_1(o_i) - p_i \lambda_1(o_k)$, and among the
linear functions of these must be found the functions from which not only $x$ but all the
other elements are eliminated, and consequently the conditional equations of the theory.
But it is easily seen that the functions

\[ p_k o_i - p_i o_k \quad\text{and}\quad \left[\frac{po}{\lambda_2}\right] \]

are mutually free. The latter is the left hand side of the normal equation which is
particularly aimed at the element $x$; it is formed by multiplying the equations (85) by the
coefficient of $x$ in each, and has the sum of the squares $\left[\frac{pp}{\lambda_2}\right]$ as the coefficient of this
element; it has thus been proved to be free of all the conditions of the theory, and must
therefore in the adjustment be computed by the directly observed values, for which reason
we have been able in the equations (86) to rewrite the function as $\left[\frac{po}{\lambda_2}\right]$. In the same
way we prove that all the other normal equations are free of the theory, each through
the elimination from (85) of its particularly prominent element. While, in the adjustment
by correlates, we exclusively made use of the equations and functions of the theory, we
put all these aside in the adjustment by elements, in order to work only with the
empirically determined functions which the normal equations represent.

The coefficients of the elements in the normal equations are, as it will be seen,
arranged in a remarkably symmetrical manner, and each of them has a significance for
the problem which it is easy to state.

The coefficients in the diagonal line, which are respectively multiplied by the
element to which the equation particularly refers, are as sums of squares all positive, and
each of them is the square of the mean error for that function of the observations in
whose equation it occurs. We have for instance

\[ \lambda_2\left[\frac{po}{\lambda_2}\right] = \left[\frac{pp}{\lambda_2}\right]. \]

The coefficients outside the diagonal line are identical in pairs; the coefficient of
$x$ in $y$'s particular equation, $\left[\frac{pq}{\lambda_2}\right]$, is the same as the coefficient of $y$ in $x$'s particular
equation. They show immediately if some of the functions should happen
to be mutually free; if for instance function $\left[\frac{po}{\lambda_2}\right]$ is to be free of $y$'s function $\left[\frac{qo}{\lambda_2}\right]$,
we must have

\[ \left[\frac{pq}{\lambda_2}\right] = 0. \]

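§ 52. If now the elements have been selected in such a convenient way that all
these sums of the products vanish, and the normal equations consequently appear in the
special form

\[ \left[\frac{pp}{\lambda_2}\right]x = \left[\frac{po}{\lambda_2}\right], \quad \left[\frac{qq}{\lambda_2}\right]y = \left[\frac{qo}{\lambda_2}\right], \quad \ldots \quad \left[\frac{rr}{\lambda_2}\right]z = \left[\frac{ro}{\lambda_2}\right], \tag{87} \]

then they offer us directly the solution of the problem of adjustment. The adjusted values
for the elements are

\[ x = \left[\frac{po}{\lambda_2}\right] : \left[\frac{pp}{\lambda_2}\right], \quad y = \left[\frac{qo}{\lambda_2}\right] : \left[\frac{qq}{\lambda_2}\right], \quad \ldots \quad z = \left[\frac{ro}{\lambda_2}\right] : \left[\frac{rr}{\lambda_2}\right], \tag{88} \]

and the squares of the mean errors

\[ \lambda_2(x) = \left[\frac{pp}{\lambda_2}\right]^{-1}, \quad \lambda_2(y) = \left[\frac{qq}{\lambda_2}\right]^{-1}, \quad \ldots \quad \lambda_2(z) = \left[\frac{rr}{\lambda_2}\right]^{-1}, \tag{89} \]

and from these we can then compute both the adjusted value and its $\lambda_2$ for every linear
function of the elements, because these are mutually free functions. In particular from
the equations (85),

\[ u_i = p_i x + q_i y + \cdots + r_i z, \]

we can compute the adjusted values of the observations, then from (35) the squares of
the mean errors $\lambda_2(u_i)$, and also the law of errors for every function of observations and
elements.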


§ 53. In ordinary cases a transformation of the system of elements is required.
It is required for the solution of the normal equations in order to find the values of the
elements; but we must remember that we have here a double problem, as it is also our
object to free the transformed elements so that they may be used for determinations of
the mean errors. The transformation therefore cannot be selected so arbitrarily as in
analogous problems of pure mathematics; yet there is a multiplicity of possibilities, and
in many special cases radical changes can lead to very beautiful solutions (see § 62). The
first thing, however, is to secure a method which may be always applied; and this must
be selected in such a way that the elements are eliminated one by one, so that the later
computation of them is prepared, and moreover, constantly, in such a way that freedom
is attained.

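This can, if we commence for instance by eliminating the element $x$, be attained
in the following way. The normal equation which particularly refers to $x$,

\[ \left[\frac{pp}{\lambda_2}\right]x + \left[\frac{pq}{\lambda_2}\right]y + \cdots + \left[\frac{pr}{\lambda_2}\right]z = \left[\frac{po}{\lambda_2}\right], \tag{90} \]

and which will be put aside to be used later on for the computation of $x$, is multiplied
by such factors, viz.

\[ \varphi = \left[\frac{qp}{\lambda_2}\right] : \left[\frac{pp}{\lambda_2}\right], \quad \ldots \quad \omega = \left[\frac{rp}{\lambda_2}\right] : \left[\frac{pp}{\lambda_2}\right], \]

that $x$ vanishes when the
products are respectively subtracted from the other normal equations; but it must be
remembered that we are not allowed to multiply the latter by any factor. The equation
for $x$ can then be written

\[ \left[\frac{pp}{\lambda_2}\right]\xi = \left[\frac{po}{\lambda_2}\right], \quad\text{where}\quad \xi = x + \varphi y + \cdots + \omega z. \tag{91} \]

The functions in the other equations

\[ \left[\frac{qo}{\lambda_2}\right] - \varphi\left[\frac{po}{\lambda_2}\right] = \left[\frac{(q-\varphi p)o}{\lambda_2}\right], \quad \ldots \quad \left[\frac{ro}{\lambda_2}\right] - \omega\left[\frac{po}{\lambda_2}\right] = \left[\frac{(r-\omega p)o}{\lambda_2}\right] \]

become, by this means, not only independent of $x$ but also free of $\left[\frac{po}{\lambda_2}\right]$ or of $\xi$, for

\[ \left[\frac{(q-\varphi p)p}{\lambda_2}\right] = 0, \text{ etc.} \]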

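The equations which in a double sense have been freed from $x$, get exactly the
same characteristic functional form as the normal equations had. If we write

\[ q' = q - \varphi p, \quad \ldots \quad r' = r - \omega p, \tag{92} \]

so that the equations for the observations become

\[ \lambda_1(o) = p\xi + q'y + \cdots + r'z, \]

we not only get, as we see at once,

\[ \left[\frac{q'o}{\lambda_2}\right] = \left[\frac{qo}{\lambda_2}\right] - \varphi\left[\frac{po}{\lambda_2}\right], \quad \ldots \quad \left[\frac{r'o}{\lambda_2}\right] = \left[\frac{ro}{\lambda_2}\right] - \omega\left[\frac{po}{\lambda_2}\right], \tag{93} \]

but also

\[ \left[\frac{q'q'}{\lambda_2}\right] = \left[\frac{qq}{\lambda_2}\right] - \varphi\left[\frac{qp}{\lambda_2}\right], \quad \ldots \]
\[ \left[\frac{q'r'}{\lambda_2}\right] = \left[\frac{qr}{\lambda_2}\right] - \varphi\left[\frac{pr}{\lambda_2}\right] = \left[\frac{rq}{\lambda_2}\right] - \omega\left[\frac{qp}{\lambda_2}\right], \quad \ldots \]
\[ \left[\frac{r'r'}{\lambda_2}\right] = \left[\frac{rr}{\lambda_2}\right] - \omega\left[\frac{rp}{\lambda_2}\right]. \tag{94} \]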


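Hence we proceed exactly in the same way from this first stage of the transformation of
the normal equations

\[ \left[\frac{q'q'}{\lambda_2}\right]y + \cdots + \left[\frac{q'r'}{\lambda_2}\right]z = \left[\frac{q'o}{\lambda_2}\right] \]
\[ \cdots \]
\[ \left[\frac{r'q'}{\lambda_2}\right]y + \cdots + \left[\frac{r'r'}{\lambda_2}\right]z = \left[\frac{r'o}{\lambda_2}\right] \tag{95} \]

using, for instance, the first of them for the elimination of the element $y$. Here
$y$ is replaced by

\[ \eta = y + \cdots + \omega' z, \quad \omega' = \left[\frac{q'r'}{\lambda_2}\right] : \left[\frac{q'q'}{\lambda_2}\right], \tag{96} \]

which is free of the element $\xi$, and for which we have

\[ \left[\frac{q'q'}{\lambda_2}\right]\eta = \left[\frac{q'o}{\lambda_2}\right]. \tag{97} \]

By means of $\omega'$ and corresponding coefficients we have, analogously to (93) and (94),

\[ \left[\frac{r''o}{\lambda_2}\right] = \left[\frac{r'o}{\lambda_2}\right] - \omega'\left[\frac{q'o}{\lambda_2}\right], \quad \left[\frac{r''r''}{\lambda_2}\right] = \left[\frac{r'r'}{\lambda_2}\right] - \omega'\left[\frac{r'q'}{\lambda_2}\right], \]

which are independent of any special computation of the coefficients $r''$.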

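Continuing in this way, till we have obtained a set consisting only of free
functions, we find, consequently, just a system of elements, $\xi, \eta, \ldots \zeta$, which possess the
above-mentioned desired property, its normal equations being of the same form as (87), viz.:

\[ \left[\frac{pp}{\lambda_2}\right]\xi = \left[\frac{po}{\lambda_2}\right], \quad \left[\frac{q'q'}{\lambda_2}\right]\eta = \left[\frac{q'o}{\lambda_2}\right], \quad \ldots \quad \left[\frac{r^{(m-1)}r^{(m-1)}}{\lambda_2}\right]\zeta = \left[\frac{r^{(m-1)}o}{\lambda_2}\right]. \tag{98} \]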


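With these elements the equations for the adjusted values of the several observations
become

\[ u_i = p_i\xi + q'_i\eta + \cdots + r^{(m-1)}_i\zeta \tag{99} \]

and for the squares of their mean errors

\[ \lambda_2(u_i) = p_i^2 : \left[\frac{pp}{\lambda_2}\right] + q'^2_i : \left[\frac{q'q'}{\lambda_2}\right] + \cdots + \left(r^{(m-1)}_i\right)^2 : \left[\frac{r^{(m-1)}r^{(m-1)}}{\lambda_2}\right]. \tag{100} \]

If we want to compute adjusted values and mean errors for the original elements or
functions of the same, the means of so doing is given by the equations of transformation

\[ x + \varphi y + \cdots + \omega z = \xi \]
\[ y + \cdots + \omega' z = \eta \]
\[ \cdots \]
\[ z = \zeta \tag{101} \]

or by (90), the first equation (95) and the last of (98), being identical with (101). For not
only the original elements $x, y, \ldots z$ are easily computed by these, but also the
coefficients in the inverse transformation

\[ x = \xi + \alpha\eta + \cdots + \gamma\zeta \]
\[ y = \eta + \cdots + \delta\zeta \]
\[ z = \zeta. \tag{102} \]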

Now, if $F$ is a given linear function of $x, y, \ldots z$, then by obvious numerical
operations we get an expression for it,

and for the square of its mean error we get





+ •••+<«’ 


ixl 


If for special criticism we want the computation of $\lambda_2(u_i)$ for many observations,
we may take advantage of transforming the equations of observations, computing their
coefficients by

\[ q' = q - \varphi p, \quad \ldots \quad r'' = r' - \omega' q', \quad \ldots \]

but we remember that $q' \ldots r''$ are quite superfluous for the coefficients of (95).
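
In modern terms the successive elimination of § 53 amounts to a weighted orthogonalisation of the coefficient columns $p, q, \ldots r$. The following sketch illustrates it under that reading; it is not the author's computing schedule, the function name `adjust_by_elements` is ours, and the weights stand for $1/\lambda_2(o)$.

```python
from fractions import Fraction

def adjust_by_elements(columns, obs, weights):
    """Successive elimination into free functions (weighted Gram-Schmidt sketch).

    columns: coefficient columns p, q, ..., r of the equations for the observations.
    Returns the free elements (xi, eta, ..., zeta), their lambda_2, and the
    original elements (x, y, ..., z) recovered by retrograde substitution.
    """
    def bracket(f, g):
        return sum(Fraction(w) * a * b for w, a, b in zip(weights, f, g))

    m = len(columns)
    free = []                                        # p, q', r'', ...
    factors = [[Fraction(0)] * m for _ in range(m)]  # phi, ..., omega, omega', ...
    for j, col in enumerate(columns):
        col = [Fraction(c) for c in col]
        for k, prev in enumerate(free):
            factors[k][j] = bracket(prev, col) / bracket(prev, prev)
            col = [c - factors[k][j] * pk for c, pk in zip(col, prev)]
        free.append(col)

    transformed = [bracket(f, obs) / bracket(f, f) for f in free]
    variances = [1 / bracket(f, f) for f in free]

    # retrograde substitution through xi = x + phi*y + ... + omega*z, etc.
    elements = transformed[:]
    for k in reversed(range(m)):
        elements[k] = transformed[k] - sum(
            factors[k][j] * elements[j] for j in range(k + 1, m))
    return transformed, variances, elements

# Illustration: eleven equally weighted observations on i = -5 ... 5,
# fitted as x + y*i + z*i^2 (weights 12 correspond to lambda_2(o) = 1/12).
index = list(range(-5, 6))
cols = [[1] * 11, index, [i * i for i in index]]
obs = [-2, -2, -1, -1, 0, 0, 0, 0, 1, 1, 1]
free_elements, free_variances, original = adjust_by_elements(cols, obs, [12] * 11)
```

Because each new column is freed from all the preceding ones before its element is computed, the returned variances belong to mutually free functions and may be added as in (100).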


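§ 54. In the theory of the adjustment by elements we must not overlook the
proposition concerning the computation of the minimum sum of squares for the benefit of
the summary criticism as well as for checking our computation. We are able to compute
the sum $\left[\frac{(o-u)^2}{\lambda_2(o)}\right]$, which is to approach the value $n-m$, as soon as we have found
only the elements, without being obliged to know the adjusted values for the separate
observations. And this computation can be performed, not only for the legitimate
adjustment, but for any values whatever of the elements. It is easiest to show this for
transformed elements, $\xi_1, \eta_1, \ldots \zeta_1$. The values $v$ for the observations corresponding to these
must be computed by (99). From this we get

\[ \left[\frac{(o-v)^2}{\lambda_2(o)}\right] = \left[\frac{oo}{\lambda_2}\right] - 2\left[\frac{po}{\lambda_2}\right]\xi_1 - 2\left[\frac{q'o}{\lambda_2}\right]\eta_1 - \cdots + \left[\frac{pp}{\lambda_2}\right]\xi_1^2 + \left[\frac{q'q'}{\lambda_2}\right]\eta_1^2 + \cdots \tag{103} \]

If we here substitute for $\left[\frac{po}{\lambda_2}\right]$, $\left[\frac{q'o}{\lambda_2}\right]$, ... the elements $\xi, \eta, \ldots \zeta$ of the legitimate
adjustment, we find from the equations (98)

\[ \left[\frac{(o-v)^2}{\lambda_2(o)}\right] = \left[\frac{oo}{\lambda_2}\right] - \left[\frac{pp}{\lambda_2}\right]\xi^2 - \left[\frac{q'q'}{\lambda_2}\right]\eta^2 - \cdots + \left[\frac{pp}{\lambda_2}\right](\xi_1-\xi)^2 + \left[\frac{q'q'}{\lambda_2}\right](\eta_1-\eta)^2 + \cdots \tag{104} \]

It is evident from this that the condition of minimum is $\xi_1 = \xi$, $\eta_1 = \eta$, ... $\zeta_1 = \zeta$. The
minimum sum of squares is therefore obtained only by the determination of the functions
that are free of the theory, by means of their directly observed values. And for this
minimum

\[ \left[\frac{(o-u)^2}{\lambda_2(o)}\right] = \left[\frac{oo}{\lambda_2}\right] - \left[\frac{po}{\lambda_2}\right]^2 : \left[\frac{pp}{\lambda_2}\right] - \left[\frac{q'o}{\lambda_2}\right]^2 : \left[\frac{q'q'}{\lambda_2}\right] - \cdots \tag{105} \]
\[ = \left[\frac{oo}{\lambda_2}\right] - \left[\frac{po}{\lambda_2}\right]\xi - \left[\frac{q'o}{\lambda_2}\right]\eta - \cdots \tag{106} \]
\[ = \left[\frac{oo}{\lambda_2}\right] - \left[\frac{pp}{\lambda_2}\right]\xi^2 - \left[\frac{q'q'}{\lambda_2}\right]\eta^2 - \cdots \]

It deserves to be noticed that the middle one of these expressions holds good, in unchanged
form, also of the original, not transformed elements and coefficients. We have

\[ \left[\frac{(o-u)^2}{\lambda_2(o)}\right] = \left[\frac{oo}{\lambda_2}\right] - \left[\frac{po}{\lambda_2}\right]x - \left[\frac{qo}{\lambda_2}\right]y - \cdots - \left[\frac{ro}{\lambda_2}\right]z, \tag{107} \]

which is easily proved by substituting in (106) the values obtained from (101). The
equation is particularly valuable as a check on the accuracy of our computation.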


§ 55. In going through the theory of adjustment by elements here developed, it
will be seen that a very essential part of the work, viz. the computation of the
transformed values of the coefficients in the equations for the several observations, may nearly
always be dispensed with. The sums of the squares, $\left[\frac{pp}{\lambda_2}\right]$, and the sums of the products,
$\left[\frac{pq}{\lambda_2}\right]$, must be transformed; but they are in themselves sufficient for the determination of
the transformations, and by their help we find values and mean errors for the elements,
first the transformed ones, but indirectly also the original ones. The adjusted values
of the observations can, consequently, also be computed without any knowledge
of $q', \ldots r^{(m-1)}$. Only for the computation of $\lambda_2(u_1), \ldots \lambda_2(u_n)$, consequently for a
special criticism, we cannot escape the often considerable work which is necessary for
the purpose.

For the summary criticism by $\left[\frac{(o-u)^2}{\lambda_2(o)}\right] = n-m$ we can even, as
we have seen, dispense with the after-computation of the several observations by means
of the elements. We ought, however, to restrict the work of adjustment so far only when
the case is either very difficult or of slight importance, for this minimum sum of squares
is generally computed much more sharply, and always with much greater certainty, directly
by $o_i - u_i$ and $\lambda_2(o_i)$, than by the formulæ (105), (106), and (107).

Add to this, that the special criticism does not exclusively rest on $\lambda_2(u_i)$ and the
scales $1 - \frac{\lambda_2(u_i)}{\lambda_2(o_i)}$, but that the very deviations $o_i - u_i$, when they are arranged according
to the more or less essential circumstances of the observations, are even a main point in
the criticism. Systematical errors, especially inaccuracies or defects in hypotheses and
theories, will betray themselves in the surest and easiest way by the progression of the
errors; regular variation in $o-u$ as a function of some circumstance, or mere absence of
frequent changes of signs, will disclose errors which might remain hidden by the check
according to $\left[\frac{(o-u)^2}{\lambda_2(o)}\right] = n-m$; progression in the errors may, we
know, even be used to indicate how we ought to try to improve the defective theory.


§ 56. By series of adjustment (compare Dr. J. P. Gram, Udjevningsrækker,
Kjøbenhavn 1879, and Crelle's Journal vol. 94), i. e. where the theory gives the observations in
the form of a series with an indeterminate (infinite) number of terms, each term being
multiplied by an unknown factor, an element, and where consequently adjustment by
elements must be employed, the criticism gets the special task of indicating how many
(or which) terms of the series we are to include in the adjustment. Formula (107)
furnishes us with the means of doing this:

\[ \left[\frac{(o-u)^2}{\lambda_2(o)}\right] = \left[\frac{oo}{\lambda_2(o)}\right] - \Sigma\,\frac{\zeta^2}{\lambda_2(\zeta)} = n - m \pm \sqrt{2(n-m)}. \]

For the $m$ terms in the series, which is here indicated by $\Sigma$, correspond, each of them, to
an element, consequently to one of the terms of the series of adjustment. For each term
we take into this, the right side of the equation of criticism is diminished by about a
unity; the result of the criticism, consequently, becomes more favourable if we leave out
all the terms for which $\frac{\zeta^2}{\lambda_2(\zeta)}$ falls below 1. If we retain any terms which essentially fall
under this rule, the adjustment becomes an under-adjustment; if, on the other hand, we
leave out terms for which $\frac{\zeta^2}{\lambda_2(\zeta)}$ exceeds 1, we make ourselves guilty of an
over-adjustment.

Example 1. The five-place logarithms in a table are looked upon as mutually
unbound observations for which the mean error is constantly $\sqrt{1/12}$ of the fifth decimal
place. The "observations", log 795, log 796, log 797, log 798, log 799, log 800, log 801,
log 802, log 803, log 804, and log 805, are to be adjusted as an integral function of the
second degree

\[ \log(800 + i) = x + yi + zi^2. \]

In order to reckon with small integral numbers, we subtract before the adjustment
$2.90309 + 0.00054\,i$, both from the observations and from the formulæ. Taking 0.00001 as
our unity, we have then the equations for the observations:

    -2 = x - 5y + 25z
    -2 = x - 4y + 16z
    -1 = x - 3y +  9z
    -1 = x - 2y +  4z
     0 = x - 1y +  1z
     0 = x
     0 = x + 1y +  1z
     0 = x + 2y +  4z
     1 = x + 3y +  9z
     1 = x + 4y + 16z
     1 = x + 5y + 25z.

From this we get $\left[\frac{oo}{\lambda_2}\right] = 156$, and the normal equations:

     -36 =  132x +    0y +  1320z
     420 =    0x + 1320y +     0z
    -540 = 1320x +    0y + 23496z.

The element $y$ is consequently immediately free of $x$ and $z$, but the latter must be made
free of one another, which is done by multiplying the first equation by 10 and subtracting
it from the third. The transformation into free functions then only requires $\xi - 10z$
substituted for $x$, and we have:

     -36 =   132 ξ,
     420 =  1320 y,
    -180 = 10296 z,

consequently,

    ξ = -0.2727,  λ2(ξ) = 1 :   132 = 390 : 51480 = 0.007576
    y =  0.3182,  λ2(y) = 1 :  1320 =  39 : 51480 = 0.000758
    z = -0.0175,  λ2(z) = 1 : 10296 =   5 : 51480 = 0.000097.

The mean error of $y$ is consequently ±0.0275, and that of $z$ ±0.0099. The element $x$
is found by $x = \xi - 10z = -0.0977$, to which corresponds $\lambda_2(x) = \lambda_2(\xi) + 100\lambda_2(z)
= 0.0173 = (0.1315)^2$. For log 800 we find thus 2.9030890 ± 0.0000013, and the
corresponding difference of the table is 54.318 ± 0.028.

For the sum of the squares of the deviations we have, according to (105)—(107),

\[ \left[\frac{(o-u)^2}{\lambda_2(o)}\right] = 156 - 9.82 - 133.64 - 3.15 = 9.39, \]

which shows that the term of the second degree contributes somewhat to the goodness of
the adjustment. This sum of squares ought, according to the number of the observations
and the elements, to be 11 - 3 = 8, with a mean uncertainty of ±4.

The best formula for computing the adjusted values of the several observations
and their mean errors is $u_i = \xi + yi + z(i^2 - 10)$, which gives:

                  u         o-u   (o-u)^2   51480 λ2(u)                        λ2(u)   Scale
    log 795 = 2.9003688    +.12    .0144    390 + 39·25 + 5·225 = 2490    .0484    .419
    log 796 = 2.9009136    -.36    .1296    390 + 39·16 + 5· 36 = 1194    .0232    .722
    log 797 = 2.9014580    +.20    .0400    390 + 39· 9 + 5·  1 =  746    .0145    .826
    log 798 = 2.9020019    -.19    .0361    390 + 39· 4 + 5· 36 =  726    .0141    .831
    log 799 = 2.9025457    +.43    .1849    390 + 39· 1 + 5· 81 =  834    .0162    .806
    log 800 = 2.9030890    +.10    .0100    390 + 39· 0 + 5·100 =  890    .0173    .792
    log 801 = 2.9036321    -.21    .0441    390 + 39· 1 + 5· 81 =  834    .0162    .806
    log 802 = 2.9041747    -.47    .2209    390 + 39· 4 + 5· 36 =  726    .0141    .831
    log 803 = 2.9047170    +.30    .0900    390 + 39· 9 + 5·  1 =  746    .0145    .826
    log 804 = 2.9052590    +.10    .0100    390 + 39·16 + 5· 36 = 1194    .0232    .722
    log 805 = 2.9058006    -.06    .0036    390 + 39·25 + 5·225 = 2490    .0484    .419
                                   -----                          -----            -----
                                   .7836                          12870            8.000

Both the checks agree: the sum of squares is 12 × 0.7836 = 9.40, and the sum
of the scales is 11 - 3.

It ought to be noticed that the adjustment gives very accurate results throughout
the greater part of the interval, with the exception of the beginning and the end. The
exactness, however, is not greatest in the middle, but near the 1st and the 3rd quarter.
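
The two checks of this example, the minimum sum of squares computed from the elements alone and the sum of the scales, can be repeated in exact arithmetic. What follows is a sketch, assuming equal weights $1/\lambda_2(o) = 12$ and the reduced observations listed above; the helper `bracket` and the variable names are ours.

```python
from fractions import Fraction

WEIGHT = 12                       # 1 / lambda_2(o), since lambda_2(o) = 1/12
INDEX = list(range(-5, 6))        # i = -5 ... 5, i.e. log 795 ... log 805
OBS = [-2, -2, -1, -1, 0, 0, 0, 0, 1, 1, 1]

def bracket(f, g):
    """The bracket [fg / lambda_2] for equal weights."""
    return WEIGHT * sum(Fraction(a) * b for a, b in zip(f, g))

p = [1] * len(INDEX)
q = INDEX
r = [i * i for i in INDEX]

omega = bracket(p, r) / bracket(p, p)                 # factor that frees z from x
r_free = [ri - omega * pi for ri, pi in zip(r, p)]    # coefficients i^2 - 10

xi = bracket(p, OBS) / bracket(p, p)
y = bracket(q, OBS) / bracket(q, q)
z = bracket(r_free, OBS) / bracket(r_free, r_free)
x = xi - omega * z

# minimum sum of squares from the elements alone
min_sum = (bracket(OBS, OBS) - bracket(p, OBS) * xi
           - bracket(q, OBS) * y - bracket(r_free, OBS) * z)

# the same sum computed directly from the deviations o - u
adjusted = [xi + y * i + z * (i * i - omega) for i in INDEX]
direct_sum = WEIGHT * sum((o - u) ** 2 for o, u in zip(OBS, adjusted))

# lambda_2(u) and the scales 1 - lambda_2(u) / lambda_2(o)
var_u = [1 / bracket(p, p) + i * i / bracket(q, q)
         + (i * i - omega) ** 2 / bracket(r_free, r_free) for i in INDEX]
scales = [1 - WEIGHT * v for v in var_u]
```

Carried out without rounding, the element $x$ comes out slightly different in the fourth decimal from the rounded value of the text, which does not affect the checks.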

Example 2. A finite, periodic function of one single essential circumstance, an angle $V$,
is supposed to be the object of observation. The theory, consequently, has the form:

\[ o = c_0 + c_1\cos V + s_1\sin V + c_2\cos 2V + s_2\sin 2V + \ldots \]

We assume that there are $n$ unbound, equally exact observations for a series of values of $V$,
whose difference is constant and $= \frac{2\pi}{n}$, for instance for $V$ = 0°, 60°, 120°, 180°, 240°, 300°.
Show that the normal equations are here originally free, and that they admit of an exceedingly
simple computation of each isolated term of the periodic series.
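
A numerical illustration of the result asked for may be useful. The sketch below assumes the $n$ angles $0, 2\pi/n, \ldots$ and harmonics of order below $n/2$, for which the sums of products of different terms vanish, so that each coefficient follows from a single sum; the function name `harmonic_elements` and the trial coefficients are ours.

```python
import math

def harmonic_elements(obs, harmonics):
    """Elements c0 and (c_k, s_k) from n equally spaced, equally exact observations.

    Assumes harmonics * 2 is smaller than n, so that the normal equations are free
    and every element is found from its own equation alone.
    """
    n = len(obs)
    angles = [2 * math.pi * i / n for i in range(n)]
    c0 = sum(obs) / n
    terms = []
    for k in range(1, harmonics + 1):
        ck = 2 / n * sum(o * math.cos(k * v) for o, v in zip(obs, angles))
        sk = 2 / n * sum(o * math.sin(k * v) for o, v in zip(obs, angles))
        terms.append((ck, sk))
    return c0, terms

# Trial data: six values at 0, 60, ..., 300 degrees built from known coefficients.
true_c0, true_terms = 1.0, [(0.5, -0.25), (0.1, 0.2)]
trial = [true_c0 + sum(c * math.cos((k + 1) * v) + s * math.sin((k + 1) * v)
                       for k, (c, s) in enumerate(true_terms))
         for v in (2 * math.pi * i / 6 for i in range(6))]
c0, terms = harmonic_elements(trial, 2)
```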

Example 3. Determine the abscissae for 4 points on a straight line whose mutual
distances are measured equally exactly, and are unbound. (Comp. Adjustment by Correlates,
Example 3, and § 60).

Example 4. Three unbound observations must, according to theory, depend on two
elements, so that

    $o_1 = x^2$,  $\lambda_2(o_1) = 1$
    $o_2 = xy$,   $\lambda_2(o_2) = \tfrac12$
    $o_3 = y^2$,  $\lambda_2(o_3) = 1$.

The theory, therefore, does not give us equations of the linear form. This may be produced
in several ways, most simply by the common method of presupposing approximate values
of both elements, the known $a$ for $x$ and $b$ for $y$, and considering the corrections $\xi$ and $\eta$
to be the elements of the adjustment. We therefore put $x = a + \xi$, and $y = b + \eta$.
Rejecting terms of the 2nd degree, we get the equations of the observations:

    $o_1 - a^2 = 2a\xi$
    $o_2 - ab = b\xi + a\eta$
    $o_3 - b^2 = 2b\eta$,

where the middle equation has still double weight. The normal equations are:

\[ 2a(o_1 - a^2) + 2b(o_2 - ab) = (4a^2 + 2b^2)\xi + 2ab\,\eta \]
\[ 2a(o_2 - ab) + 2b(o_3 - b^2) = 2ab\,\xi + (4b^2 + 2a^2)\eta. \]

$\xi$ is consequently not free of $\eta$, but we find

\[ 2ax = o_1 + a^2 - \frac{b^2(b^2 o_1 - 2ab\,o_2 + a^2 o_3)}{(a^2+b^2)^2}, \quad \lambda_2(x) = \frac{a^2 + 2b^2}{4(a^2+b^2)^2}, \]
\[ 2by = o_3 + b^2 - \frac{a^2(b^2 o_1 - 2ab\,o_2 + a^2 o_3)}{(a^2+b^2)^2}, \quad \lambda_2(y) = \frac{2a^2 + b^2}{4(a^2+b^2)^2}. \]

For the adjusted value of the middle observation we have

\[ u_2 = \frac{ab^3 o_1 + (a^4 + b^4)o_2 + a^3 b\,o_3}{(a^2+b^2)^2}, \quad \lambda_2(u_2) = \frac{a^4 + b^4}{2(a^2+b^2)^2}. \]

If we had transformed the elements (comp. § 62) by putting

\[ \xi = a\zeta - b\tau, \quad \eta = b\zeta + a\tau, \]

or

\[ x = a(1+\zeta) - b\tau, \quad y = b(1+\zeta) + a\tau, \]

we should have obtained free normal equations

\[ 2(a^2 o_1 + 2ab\,o_2 + b^2 o_3) - 2(a^2+b^2)^2 = 4(a^2+b^2)^2\,\zeta \]
\[ 2\{-ab\,o_1 + (a^2 - b^2)o_2 + ab\,o_3\} = 2(a^2+b^2)^2\,\tau. \]

If we had placed absolute confidence in the adjusting principle of the sum of
squares as a minimum, a solution might have been founded on

\[ (o_1 - a^2)^2 + 2(o_2 - ab)^2 + (o_3 - b^2)^2 = \text{min}. \]

The conditions of minimum are:

\[ (o_1 - a^2)a + (o_2 - ab)b = 0 \]
\[ (o_2 - ab)a + (o_3 - b^2)b = 0. \]

The solution with respect to $a$ and $b$ is not very difficult. We see for instance
immediately that

\[ (o_1 - a^2)(o_3 - b^2) = (o_2 - ab)^2 \]

or

\[ o_1 o_3 - o_2^2 = b^2 o_1 - 2ab\,o_2 + a^2 o_3. \]

Still better is it to introduce $s^2 = a^2 + b^2$, by which the equations become

\[ (o_1 - s^2)a + o_2 b = 0 \]
\[ o_2 a + (o_3 - s^2)b = 0, \]

consequently,

\[ s^4 - s^2(o_1 + o_3) + o_1 o_3 - o_2^2 = 0. \]

If the errors in $o_1$, $o_2$, and $o_3$ are not large, $o_1 o_3 - o_2^2$ must be small; one of the
two values of $s^2$ must then be small, the other nearly equal to $o_1 + o_3$; only the latter can
be used.

Further, we get:

\[ \frac{a^2}{s^2} = \frac{o_3 - s^2}{o_1 + o_3 - 2s^2}, \quad \frac{b^2}{s^2} = \frac{o_1 - s^2}{o_1 + o_3 - 2s^2}. \]

In this way we avoid guessing at approximate values (for which otherwise we
should perhaps have taken $a^2 = o_1$ and $b^2 = o_3$). The values which we have here found
for $a^2$ and $b^2$, and to which may be added

\[ ab = -\frac{o_2 s^2}{o_1 + o_3 - 2s^2}, \]

are really exact; and if we substitute them in the above normal equations, we get $\xi = 0$
and $\eta = 0$.

Even when, as in this case, the theory is not linear, it is not unusual for the
sum of the squares to be a minimum. Caution, however, is necessary; particularly, it
may happen that the sum of the squares becomes a maximum for the found elements, or
for some of them.
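
The exact solution through the quadratic equation in $s^2$ lends itself to a short numerical trial. The following sketch assumes that the larger root is the useful one, as stated above, and that $a$ is taken positive; the function name `exact_minimum` and the trial observations are ours and purely illustrative.

```python
import math

def exact_minimum(o1, o2, o3):
    """Values a, b making (o1-a^2)^2 + 2(o2-ab)^2 + (o3-b^2)^2 stationary.

    Uses s^2 = a^2 + b^2 as the larger root of
    s^4 - s^2 (o1 + o3) + o1*o3 - o2^2 = 0.
    """
    trace = o1 + o3
    root = math.sqrt(trace ** 2 - 4 * (o1 * o3 - o2 ** 2))
    s2 = (trace + root) / 2                  # the root nearly equal to o1 + o3
    denom = trace - 2 * s2
    a2 = s2 * (o3 - s2) / denom
    ab = -o2 * s2 / denom
    a = math.sqrt(a2)                        # sign convention: a positive
    b = ab / a
    return a, b

# Trial observations near x = 2, y = 3 (hypothetical numbers).
o1, o2, o3 = 4.1, 6.0, 8.9
a, b = exact_minimum(o1, o2, o3)
condition_a = (o1 - a * a) * a + (o2 - a * b) * b
condition_b = (o2 - a * b) * a + (o3 - b * b) * b
```

Both `condition_a` and `condition_b` should vanish to rounding error, which is the statement that the corrections $\xi$ and $\eta$ are zero when these values are used as approximations.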

We may also in another way make the equations of this example linear, namely,
by considering the logarithms of $o_1$, $o_2$, $o_3$ as the observed quantities, and finding the
logarithms of the elements from the equations which will then be linear:

    $\log o_1 = 2\log x$
    $\log o_2 = \log x + \log y$
    $\log o_3 = 2\log y$.

In this way we throw the difficulty over upon the squares of the mean errors. As

\[ \log(z + dz) = \log z + \frac{dz}{z}, \]

we may approximately take

\[ \lambda_2(\log o) = \frac{\lambda_2(o)}{o^2}. \]

If $a$ and $b$ also here indicate approximate values of $x$ and $y$, the weights of the
3 equations, respectively, become proportional to $a^4$, $2a^2b^2$, and $b^4$. Thus we find the
normal equations

\[ 2a^4\log o_1 + 2a^2b^2\log o_2 = (4a^4 + 2a^2b^2)\log x + 2a^2b^2\log y \]
\[ 2a^2b^2\log o_2 + 2b^4\log o_3 = 2a^2b^2\log x + (4b^4 + 2a^2b^2)\log y, \]

which give the simple results

\[ 2\log x = \log o_1 - \frac{b^4}{(a^2+b^2)^2}(\log o_1 - 2\log o_2 + \log o_3), \quad \lambda_2(\log x) = \frac{a^2 + 2b^2}{4a^2(a^2+b^2)^2}, \]
\[ 2\log y = \log o_3 - \frac{a^4}{(a^2+b^2)^2}(\log o_1 - 2\log o_2 + \log o_3), \quad \lambda_2(\log y) = \frac{2a^2 + b^2}{4b^2(a^2+b^2)^2}. \]

This solution agrees only approximately with the preceding one. It might seem
for a moment that, in this way, we might do without the supposition of approximate values
for the elements, but this is far from being the case. For the sake of the weights we
must, with the same care, demand that $a$ and $x$, as also $b$ and $y$, agree, and we must
repeat the adjustment till the squares of the mean errors get the theoretically correct
values. And then it is only a necessary, but not a sufficient condition, that $x - a$ and
$y - b$ are small. Unless the exactness of the observations is also so great that the mean
errors of $o_i$ are small in proportion to $o_i$ itself, the laws of errors of the logarithms cannot
be considered typical at the same time as those of the observations themselves.

Example 5. The co-ordinates of four points in a circle are observed with equal
mean errors and without bonds: $x_1 = 20$, $y_1 = 10$; $x_2 = 16$, $y_2 = 18$; $x_3 = 3$, $y_3 = 17$;
and $x_4 = 2$, $y_4 = 4$. In the adjustment for the co-ordinates $a$ and $b$ of the centre and
the radius $r$, we cannot use the common form of the equations

\[ (x - a)^2 + (y - b)^2 = r^2, \]

because it embraces more than one observed quantity besides the elements. In order to
obtain the separation of the observations necessary for adjustment by elements, we must
add a supplementary element, or parameter, $V$, for each point, writing for instance

\[ x_i = a + r\cos V_i, \quad y_i = b + r\sin V_i. \]

As the equations are not linear we must work by successive corrections $\Delta a$, $\Delta b$,
$\Delta r$, $\Delta V_i$ of the elements, of which the first approximate system can be obtained by
ordinary computation from 3 points. For the theoretical corrections $\Delta x_i$ and $\Delta y_i$ of the
co-ordinates we get by differentiation of the above equations

\[ \Delta x_i = \Delta a + \Delta r\cos V_i - \Delta V_i \cdot r\sin V_i \]
\[ \Delta y_i = \Delta b + \Delta r\sin V_i + \Delta V_i \cdot r\cos V_i. \]

These equations for the observations lead us to a system of seven normal equations.
By the "method of partial elimination" (§ 61) these are not difficult to solve, but here the
simplicity of the problem makes it possible for us immediately to discover the artifice.
We know that every transformation of equally well observed rectangular co-ordinates results
in free functions. The radial and the tangential corrections

\[ \Delta x_i\cos V_i + \Delta y_i\sin V_i = \Delta n_i \]

and

\[ \Delta x_i\sin V_i - \Delta y_i\cos V_i = \Delta t_i \]

can, consequently, here be taken directly for the mean values of corrections of observed
quantities, and as only the four equations

\[ \Delta t_i = \Delta a\sin V_i - \Delta b\cos V_i - r\,\Delta V_i \]

contain the four corrections $\Delta V_i$ of the parameters, they can be legitimately reserved for
the successive corrections of the elements. In this way

\[ \Delta n_i = \Delta a\cos V_i + \Delta b\sin V_i + \Delta r \]

with equal mean errors, $\lambda_2(n) = \lambda_2(x) = \lambda_2$, are the "equations for the observations"
of this adjustment, and give the three normal equations:

\[ [\Delta n\cos V] = \Delta a[\cos^2 V] + \Delta b[\cos V\sin V] + \Delta r[\cos V] \]
\[ [\Delta n\sin V] = \Delta a[\cos V\sin V] + \Delta b[\sin^2 V] + \Delta r[\sin V] \]
\[ [\Delta n] = \Delta a[\cos V] + \Delta b[\sin V] + \Delta r \cdot 4. \]

In the special case under consideration, we easily see that the first, second, and
fourth point lie on the circle with $r = 10$, whose centre has the co-ordinates $a = 10$ and
$b = 10$; the parameters are consequently:

    $V_1$ = 0° 0'.0, $V_2$ = 53° 7'.8, $V_3$ = 135° 0'.0, and $V_4$ = 216° 52'.2.

For the third point the computed co-ordinates are: $x_3 = 2.9290$ and $y_3 = 17.0710$,
consequently, $\Delta x_3 = +0.0710$ and $\Delta y_3 = -0.0710$, $\Delta t_3 = 0$, and $\Delta n_3 = -0.1005$;
all other differences $\Delta x_i = 0$ and $\Delta y_i = 0$. The "equations for the observations" are:

     1.0000 Δa + 0.0000 Δb + 1.0000 Δr =  0.0000
     0.6000 Δa + 0.8000 Δb + 1.0000 Δr =  0.0000
    -0.7071 Δa + 0.7071 Δb + 1.0000 Δr = -0.1005
    -0.8000 Δa - 0.6000 Δb + 1.0000 Δr =  0.0000.

The normal equations are:

        2.5000 Δa + 0.4600 Δb + 0.0929 Δr = +0.0710
        0.4600 Δa + 1.5000 Δb + 0.9071 Δr = -0.0710
    R = 0.0929 Δa + 0.9071 Δb + 4.0000 Δr = -0.1005.

By elimination of Δr we get

        2.4978 Δa + 0.4390 Δb = +0.0733
    B = 0.4390 Δa + 1.2943 Δb = -0.0482;

and by eliminating Δb

    A = +2.3490 Δa = +0.0896.

From $R$, $B$, and $A$ we compute

    Δa = +0.0381, Δb = -0.0501, and Δr = -0.01465.

The checks are found by substitution of these in the several equations. The 4 equations
for the observations give the following adjusted values of

$$\Delta n_1 = +0.0234, \quad \Delta n_2 = -0.0319, \quad \Delta n_3 = -0.0770, \quad\text{and}\quad \Delta n_4 = -0.0151:$$

the sum of squares (here $= (8-7)\lambda_2$) is consequently

$$(0.0234)^2 + (0.0319)^2 + (0.0235)^2 + (0.0151)^2 = 0.00235.$$

For this, by the equation (108), we get

$$0.01010 - 0.00271 - 0.00356 - 0.00147 = 0.00236$$
as the final check of the adjustment.
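
The elimination above can be retraced numerically. The following is a minimal Python sketch, not the author's own procedure: it forms the normal equations from the four parameters $V_i$ and the single non-zero $\Delta n_3 = -0.1005$, then solves them by plain Gaussian elimination. The names `normal_equations` and `solve` are illustrative assumptions.

```python
from math import cos, sin, radians

# Parameters V_i of the four points (degrees) and the observed radial
# corrections; only the third point deviates from the provisional circle.
V_DEG = [0.0, 53.0 + 7.8 / 60, 135.0, 216.0 + 52.2 / 60]
DELTA_N = [0.0, 0.0, -0.1005, 0.0]

def normal_equations(v_deg, delta_n):
    """Build the coefficient rows, N and the right-hand side for (da, db, dr)."""
    rows = [(cos(radians(v)), sin(radians(v)), 1.0) for v in v_deg]
    N = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rhs = [sum(r[i] * o for r, o in zip(rows, delta_n)) for i in range(3)]
    return rows, N, rhs

def solve(N, rhs):
    """Gaussian elimination without pivoting (enough for this small system)."""
    n = len(rhs)
    A = [row[:] + [b] for row, b in zip(N, rhs)]
    for k in range(n):
        for i in range(k + 1, n):
            f = A[i][k] / A[k][k]
            for j in range(k, n + 1):
                A[i][j] -= f * A[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (A[i][n] - tail) / A[i][i]
    return x

rows, N, rhs = normal_equations(V_DEG, DELTA_N)
da, db, dr = solve(N, rhs)
# Residuals of the radial corrections and their sum of squares.
residuals = [o - (r[0] * da + r[1] * db + r[2] * dr)
             for r, o in zip(rows, DELTA_N)]
sum_sq = sum(e * e for e in residuals)
print(round(da, 4), round(db, 4), round(dr, 5), round(sum_sq, 5))
```

The results should agree with the printed values to about the last figure carried in the text; small differences come from the four-figure rounding used in the hand computation.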

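The 4 equations for $\Delta t$ give us

$$\Delta V_1 = +17'.2, \quad \Delta V_2 = +20'.8, \quad \Delta V_3 = -2'.9, \quad\text{and}\quad \Delta V_4 = -21'.6.$$
Thus, by addition of the found corrections to the approximate values,

$$r = 9.98535, \quad a = 10.0381, \quad b = 9.9499,$$

$$V_1 = 0^\circ 17'.2, \quad V_2 = 53^\circ 28'.6, \quad V_3 = 134^\circ 57'.1, \quad\text{and}\quad V_4 = 216^\circ 30'.6,$$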

we have the whole system of elements for the next approximation, if they are not the
definitive values. In both cases we must compute by them the adjusted values of the
co-ordinates, according to the exact formulae; the resulting differences, obs. minus comp., are:

Point      Δx         Δy         Δn         Δt

  1     -0.0232    +0.0002    -0.0232    +0.0002
  2     +0.0191    +0.0257    +0.0320     0.0000
  3     +0.0166    -0.0166    -0.0234    -0.0001
  4     -0.0123    -0.0090    +0.0152     0.0000

The sum of the squares, $[(\Delta x)^2 + (\Delta y)^2] = 0.00236$, agrees with the above value,
which indicates that the approximation of this first hypothesis may have been sufficient. 
Indeed, the students who will try the next approximation by means of our final differences, 
will, in this case, find only small corrections. 

From the equations $A$, $B$, and $R$, which express the free elements by the original
bound elements, $\Delta a$, $\Delta b$, $\Delta r$, we easily compute the equations for the inverse
transformation:

$$\begin{aligned}
\Delta a &= \phantom{-}0.4257\,A \\
\Delta b &= -0.1444\,A + 0.7726\,B \\
\Delta r &= \phantom{-}0.0228\,A - 0.1752\,B + 0.25\,R.
\end{aligned}$$

By these, any function of the elements for a given parameter can be expressed as a linear
function of the free functions $A$, $B$, and $R$; and by $\lambda_2(A) = 2.3490\,\lambda_2$, $\lambda_2(B) = 1.2943\,\lambda_2$,
and $\lambda_2(R) = 4\,\lambda_2$ the mean error is easily found. Thus the squares of the mean errors of
the co-ordinates $x$ and $y$ are

$$\begin{aligned}
\lambda_2(x) &= \{2.3490\,(0.4257 + 0.0228\cos V)^2 + 1.2943\,(-0.1752\cos V)^2 + 4\,(0.25\cos V)^2\}\,\lambda_2 \\
\lambda_2(y) &= \{2.3490\,(-0.1444 + 0.0228\sin V)^2 + 1.2943\,(0.7726 - 0.1752\sin V)^2 + 4\,(0.25\sin V)^2\}\,\lambda_2.
\end{aligned}$$

Only the value $\lambda_2 = 0.00236$, found by the summary criticism, is here very
uncertain.


XIII. SPECIAL AUXILIARY METHODS.

§ 57. We have often occasion to use the method of least squares, particularly
adjustment by elements; and this sometimes requires so much work that we must try to
shorten it as much as possible, even by means which are not quite lawful. Several
temptations lie near enough to tempt the many who are soon tired by a somewhat lengthened
computation, but not so much by looking for subtleties and short cuts. And as, moreover,
the method was formerly considered the best solution (among others more or less good),
not the only one that was justified under the given supposition, it is no wonder that it
has come to be used in many modifications which must be regarded as unsafe or wrong.
After what we have seen of the difference between free and bound functions, it will be
understood that the consequences of transgressions against the method of least squares
stand out much more clearly in the mean errors of the results than in their adjusted
values. And as, to some extent justly, more importance is attached to getting tolerably
correct values computed for the elements than to getting a correct idea of the uncertainty,
the lax morals with respect to adjustments have taken the form of an assertion to the
effect that we can, within this domain, do almost as we like, without any great harm,
especially if we take care that a sum of squares, either the correct one or another, becomes a
minimum. This, of course, is wrong. In a text-book we should do more harm than good
by stating all the artifices which even experienced computers have allowed themselves to
employ, under special circumstances and in face of particularly great difficulties. Only
a few auxiliary methods will be mentioned here, which are either quite correct or nearly
so, when simple caution is observed.

§ 58. When methodic adjustment was first employed, large numbers of figures 
were used in the computations (logarithms with 7 decimal places), and people often com¬ 
plained of the great labour this caused; but it was regarded as an unavoidable evil, when 
the elements were to be determined with tolerable exactness. We can very often manage, 
however, to get on by means of a much simpler apparatus, if we do not seek something
which cannot be determined. During the adjustment properly so called, we ought to be
able to work with three figures. But this ideal presupposes that two conditions are satis¬ 
fied: the elements we seek must be small and free of one another, or nearly so; and in 
both respects it can be difficult enough to protect oneself in time by appropriate trans¬ 
formation. Often it is only through the adjustment itself that we learn to know the 
artifices which would have made the work easy. This applies particularly to the mutual 
freedom of the elements. The condition of their smallness is satisfied, if we everywhere use 
the same preparatory computation as is necessary when the theory is not of linear form. 

By such means as are used in the exact mathematics, or by a provisional, more
or less allowable adjustment, we get, corresponding to the several observations $o_1 \ldots o_n$,
a set of values $v_1 \ldots v_n$ which are computed by means of the values $x_0 \ldots z_0$ of the
several elements $x \ldots z$, and which, while they satisfy all the conditions of the theory with
perfect or at any rate considerable exactness, nowhere show any great deviation from the
corresponding observed value. It is then these deviations $o_i - v_i$ and $x - x_0, \ldots, z - z_0$ which are
made the object of the adjustment, instead of the observations and elements themselves
with which, we know, they have mean error in common. When in a non-linear theory
the equations between the adjusted observation and the elements are of the general form

$$u_i = F(x, \ldots, z),$$

they are changed into

$$u_i - v_i = \left(\frac{\partial F}{\partial x}\right)_0 (x - x_0) + \cdots + \left(\frac{\partial F}{\partial z}\right)_0 (z - z_0)$$

by means of the terms of the first degree in Taylor's series, or by some other method of
approximation. If the equations are linear,

$$u_i = p_i x + \cdots + r_i z,$$

we have, without any change, for the deviations:

$$u_i - v_i = p_i (x - x_0) + \cdots + r_i (z - z_0). \qquad (110)$$

No special luck is necessary to find sets of values for which the deviations
$o_i - v_i$ show only two significant figures; and then computation by 3 figures is, as
far as that goes, sufficient for the needs of the adjustment.

The method certainly requires considerable extra work in the preparatory
computation, and it must not be overlooked that computations with an exactness of many
decimal places will often be necessary in this part; especially $v_i$ ought to be computed with
the utmost care as a function of $x_0 \ldots z_0$, lest any uncertainty in this computation should
increase the mean errors, so that we dare not put $\lambda_2(o - v) = \lambda_2(o)$.

This additional work, however, is not quite wasted, even when the theory is linear. 
The list of the deviations $o_i - v_i$ will, by easy estimates, graphic construction, or directly
by the eye, with tolerable certainty lead to the discovery of gross errors in the series of
observations, slips of the pen, etc., which must not be allowed to get into the adjustment.
The preliminary rejection of such observations may save a whole adjustment; the
ultimate rejection, however, falls under the criticism after the adjustment.

In computing the adjusted values, particularly $u_i$, after the solution of the normal
equations, we ought not to rely too confidently on the transformation of the equations into
linear form or into equations of deviations for $o_i - v_i$. Where it is possible, the actual
equations $u_i = F(x, \ldots, z)$ ought to be employed, and with the same degree of accuracy
as in the computation of $v_i$. In this way only can we see whether the approximate system
of elements and values has been so near to the final result as to justify the rejection of
the higher terms in Taylor's series. If not, the adjustment may only be regarded as
provisional, and must be repeated until the values of $u_i$, got by direct computation,
agree with the values got through $u_i - v_i$ in the linear equations of adjustment.
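
The repetition described here can be sketched in code. The example below is only an illustration under stated assumptions: the non-linear theory $u = x\,e^{zt}$, the data, and the function names are all hypothetical, chosen so that the two normal equations can be solved in closed form. Each round adjusts the deviations $o_i - v_i$ by the first-degree Taylor terms and then compares the exactly computed $u_i$ with the linear prediction, stopping when they agree.

```python
from math import exp

def model(x, z, t):
    """Hypothetical non-linear theory u = F(x, z; t) = x * exp(z * t)."""
    return x * exp(z * t)

def adjust_once(x0, z0, ts, obs):
    """One linearised adjustment about the provisional elements (x0, z0)."""
    # Deviations o_i - v_i, with v_i computed exactly from (x0, z0).
    dev = [o - model(x0, z0, t) for t, o in zip(ts, obs)]
    # First-degree Taylor coefficients p_i = dF/dx and q_i = dF/dz.
    p = [exp(z0 * t) for t in ts]
    q = [x0 * t * exp(z0 * t) for t in ts]
    pp = sum(a * a for a in p)
    pq = sum(a * b for a, b in zip(p, q))
    qq = sum(b * b for b in q)
    po = sum(a * d for a, d in zip(p, dev))
    qo = sum(b * d for b, d in zip(q, dev))
    det = pp * qq - pq * pq
    dx = (po * qq - qo * pq) / det
    dz = (qo * pp - po * pq) / det
    return x0 + dx, z0 + dz

def adjust(x0, z0, ts, obs, tol=1e-12, max_rounds=50):
    """Repeat until exact and linearised adjusted values agree within tol."""
    for _ in range(max_rounds):
        x1, z1 = adjust_once(x0, z0, ts, obs)
        # Largest gap between the exact u_i and its linear prediction.
        gap = max(
            abs(model(x1, z1, t)
                - (model(x0, z0, t)
                   + exp(z0 * t) * (x1 - x0)
                   + x0 * t * exp(z0 * t) * (z1 - z0)))
            for t in ts
        )
        x0, z0 = x1, z1
        if gap < tol:
            break
    return x0, z0

ts = [0.0, 1.0, 2.0, 3.0]
obs = [model(2.0, 0.5, t) for t in ts]   # error-free illustrative data
x_hat, z_hat = adjust(1.8, 0.45, ts, obs)
```

With error-free data the repeated adjustment should return the generating elements; with real observations the same loop would stop at the least-squares values instead.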

On the whole the adjustment ought to be repeated frequently till we get a sufficient 
approximation. This, for instance, is the rule where the observations represent probabilities, 
for which $\lambda_2(o_i)$ is generally known only as functions of the unknown quantities which
the adjustment itself is to give us. 

§ 59. The form of the theory, and in particular the selection of its system of 
elements, is as a rule determined by purely mathematical considerations as to the 
elegance of the formula, and only exceptionally by that freedom between the elements 
which is wanted for the adjustment. On the other hand it will generally be impossible
to arrange the adjustment in such a way that the free elements with which it ends, can 
all be of direct, theoretical interest. A middle course, however, is always desirable, for the 
reasons mentioned in the foregoing paragraph, and very frequently it is also possible, if 
only the theory pays so much respect to the adjustments that it avoids setting up, in the 
same system, elements between which we may expect beforehand that strong bonds will 
exist. Thus, in systems of elements of the orbits of planets, the length of the nodes and
the distance of the perihelion from the node ought not both to be introduced as elements; 
for a positive change in the former will, in consequence of the frequent, small angles of 
inclination, nearly always entail an almost equally large negative change in the latter. If 
a theory says that the observation is a linear function of a single parameter $t$, the formula
ought not to be written $o = a + b\,t$ unless all the $t$'s are small, some positive, and others
negative, but $o = c + b\,(t - t_0)$, where $t_0$ is an average of the parameters corresponding to
the observations. If we succeed, in this way, in avoiding all strongly operating bonds,
and this can be known by the coefficients of all the normal equations outside the diagonal 
line becoming numerically small in comparison with the mean proportional between the 
two corresponding coefficients in the diagonal line, then we have at any rate attained so
much that we need not use in the calculations for the adjustment many more decimal
places than about the 3, which will always be sufficient when the elements are originally
mutually free and need not first be transformed into freedom during the adjustment, with
painful accuracy in the transformation operations.

If, by careful selection of the elements, we even get so far that no sum of the 
products in numerical value exceeds about ^ of the mean proportional between the 
corresponding sums of squares fe], or in many cases only ^ of these amounts, 
then we may consider the bonds between the elements insignificant. The normal equations
themselves may then be used to determine the law of error for the elements; we compute
provisionally a first approximation by putting all the small sums of products $= 0$, and in
the second approximation we correct the $[po]$'s by substituting the sums of the products
and the values of the elements as found in the first approximation. For instance:


[Equations (111), (112), and (113), giving the first and second approximations, are not legible in the source.]


As the errors in these determinations are of the second order, it will not, if the o's 
themselves are small deviations from a provisional computation, be necessary to make any 
further approximations. 
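
The two-step procedure described above (neglect the small sums of products, then correct the right-hand sides with them) can be sketched as follows. This is an illustration assumed from the prose, not a transcription of the author's formulas; the matrix, the right-hand side, and the function names are made up for the purpose.

```python
def two_step_solution(N, rhs):
    """First and second approximations for nearly free normal equations."""
    n = len(rhs)
    # First approximation: put every off-diagonal sum of products = 0.
    first = [rhs[i] / N[i][i] for i in range(n)]
    # Second approximation: correct each right-hand side with the neglected
    # products, evaluated at the first approximation.
    second = [
        (rhs[i] - sum(N[i][j] * first[j] for j in range(n) if j != i)) / N[i][i]
        for i in range(n)
    ]
    return first, second

def max_residual(N, rhs, x):
    """Largest absolute discrepancy left in the normal equations."""
    return max(
        abs(sum(N[i][j] * x[j] for j in range(len(x))) - rhs[i])
        for i in range(len(rhs))
    )

# Illustrative numbers only (not from the text): small off-diagonal terms.
N = [[10.0, 0.5, 0.3],
     [0.5, 8.0, 0.4],
     [0.3, 0.4, 12.0]]
rhs = [1.0, -2.0, 3.0]
first, second = two_step_solution(N, rhs)
print(max_residual(N, rhs, first), max_residual(N, rhs, second))
```

The discrepancy left by the second approximation is of the second order in the ratio of off-diagonal to diagonal coefficients, which is why no further round is needed when that ratio is small.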

Even if the bonds between the elements, which are stated in terms of the sums
of the products, are stronger, we can sometimes get them untied without any transformation.
If we can get new observations, which are just such functions of the elements that
the sums of the products will vanish if they are also taken into consideration, we will of
course put off the adjustment until, by introducing them into it, we can not only facilitate
the computation but also increase the theoretical value and clearness of the result. And
if we can attain freedom of the elements by rejecting from a long series of observations
some single ones, we do not hesitate to use this means; especially as such unused observations
may very well be employed in the criticism. If, for instance, an arctic expedition
has made meteorological observations at some fixed station for a little more than a complete
year, we shall not hesitate in the adjustment, by means of periodical functions, to
leave out the overlapping observations, or to make use of the means of the double values,
giving them the weight of single observations.

¹) In what follows we write, for the sake of brevity, $[pq]$ for $\left[\frac{pq}{\lambda_2(o)}\right]$.



§ 60. Though of course the fabrication of observations is, in general, the greatest 
sin which an applied science can commit, there exists, nevertheless, a rather numerous and 
important class of cases, in which we both can and ought to use a method which just 
depends on the fabrication of such observations as might bring about the freedom of the 
theoretical elements. As a warning, however, against misuse I give it a harsh name; the 
method of fabricated observations. 

If, for instance, we consider the problem which has served us as an example in the
adjustment, both by correlates and by elements, viz. the determination of the abscissae for
4 points whose 6 mutual distances have been measured by equally good, bond-free
observations, we can scarcely, after the indications now given, look at the normal equations,

$$\begin{aligned}
 o_{12} + o_{13} + o_{14} &= \phantom{-}3x_1 - 1x_2 - 1x_3 - 1x_4 \\
-o_{12} + o_{23} + o_{24} &= -1x_1 + 3x_2 - 1x_3 - 1x_4 \\
-o_{13} - o_{23} + o_{34} &= -1x_1 - 1x_2 + 3x_3 - 1x_4 \\
-o_{14} - o_{24} - o_{34} &= -1x_1 - 1x_2 - 1x_3 + 3x_4,
\end{aligned}$$

without immediately feeling the want of a further observation:

$$O = 1x_1 + 1x_2 + 1x_3 + 1x_4,$$

which, if we imagine it to have the same weight $= 1$ as each of the measurements of
distance $o_{rs} = x_r - x_s$, will give by addition to the others, but without specifying the
value of $O$,

$$\begin{aligned}
O + o_{12} + o_{13} + o_{14} &= 4x_1 \\
O - o_{12} + o_{23} + o_{24} &= 4x_2 \\
O - o_{13} - o_{23} + o_{34} &= 4x_3 \\
O - o_{14} - o_{24} - o_{34} &= 4x_4,
\end{aligned}$$

and consequently determine all 4 abscissae as mutually free and with fourfold weight.
What in this and other cases entitles us to fabricate observations is indeterminateness
in the original problem of adjustment: here, the impossibility of determining
any of the abscissae by means of the distances between the points. When we treat
such problems in exact mathematics we get simpler, more symmetrical, and easier solutions
by introducing values which can only be determined arbitrarily; it is so also in
the theory of observation. But the arbitrariness gets here a greater extent, because not
only mean values, but also mean errors must be introduced for greater convenience. And
while we can always make use of a fabricated observation in indeterminate problems for
the complete or partial liberation of the elements, we must here carefully demonstrate,
by criticism in each case, that the fabrication we have used has not changed anything
which was really determined without it.



In the above example, this is seen in the first place by $O$ disappearing from all
the adjusted values for the distances $x_r - x_s$, and then by $O$'s own adjusted value being
determined as the sum $x_1 + x_2 + x_3 + x_4$ and leading only to the identity $O = O$. The
adjustment will consequently neither determine $O$ nor let it get any influence on the
other determinations. The mean errors show the same and, moreover, in such a way that
the criterion becomes independent of whether $O$ has been brought into the computation
as an indeterminate number or with an arbitrary value, for, after the adjustment as well
as before, we have for $O$, $\lambda_2(O) = 1$. The scale for $O$ is consequently $= 0$, and this is
also generally a sufficient proof of our right to use the method of fabricated observations.
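
That the fabricated observation leaves every really determined quantity untouched can be checked numerically. The sketch below assumes the sign convention $o_{rs} = x_r - x_s$ and uses invented distances; the function name `adjust_abscissae` is illustrative. It solves the freed normal equations $4x_r = O + (\text{signed sum of the distances touching point } r)$ for two different values of $O$ and compares the adjusted distances.

```python
from itertools import combinations

def adjust_abscissae(dist, fabricated):
    """Abscissae of 4 points from 6 distances plus O = x_1 + x_2 + x_3 + x_4.

    With the fabricated observation of weight 1 the normal equations reduce
    to 4 * x_r = O + (signed sum of the distances touching point r).
    """
    x = {}
    for r in range(1, 5):
        signed = 0.0
        for (a, b), o in dist.items():
            if a == r:
                signed += o      # o_rs enters the equation for x_r with +1
            elif b == r:
                signed -= o      # and the equation for x_s with -1
        x[r] = (fabricated + signed) / 4.0
    return x

# Hypothetical measurements (not from the text), keyed by (r, s) with r < s.
dist = {(1, 2): -1.02, (1, 3): -2.99, (1, 4): -6.01,
        (2, 3): -1.98, (2, 4): -5.00, (3, 4): -3.03}

xa = adjust_abscissae(dist, fabricated=0.0)
xb = adjust_abscissae(dist, fabricated=40.0)
pairs = list(combinations(range(1, 5), 2))
adjusted_a = {(r, s): xa[r] - xa[s] for r, s in pairs}
adjusted_b = {(r, s): xb[r] - xb[s] for r, s in pairs}
```

The adjusted distances are the same for both choices, while the sum of the abscissae simply reproduces whatever value of $O$ was put in, which is the identity $O = O$ of the text.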

§ 61. The method of partial eliminations. When the number of elements is
large, it becomes a very considerable task to transform the normal equations and eliminate
the elements. The difficulty is nearly proportional to the square of that number. Long
before the elements would become so numerous that adjustment by correlates could be
indicated, a correct adjustment by elements can become practically impossible. The special
criticism is quite out of the question, the summary criticism can scarcely be suggested, and
the very elimination must be made easier at any price. If it then happens that some of
the elements enter into the expressions for some of the observations only, and not at all in
the others, then there can be no doubt that the expedient which ought first to be employed
is the partial elimination (before we form the normal equations) of such elements from the
observations concerning them. These observations will by this means be replaced by certain
functions of two observations or more, which will generally be bound, and they will be
so in a higher and more dangerous degree the fewer elements we have eliminated. By
this proceeding we may, consequently, imperil the whole ensuing adjustment, the foundation
of which, we know, is unbound or free observations as functions of its elements.

If now it must be granted that the difficulties can become so great that we cannot
insist on absolute prohibition against illegitimate elimination, we must on the other
hand emphatically warn against every elimination which is not performed through free
functions, and much the more so, as it is quite possible, in a great many cases in which
abuses have taken place, to remain within the strictly legitimate limits of the free functions,
by the use of "the method of partial eliminations".

This is connected with the cases in which some of the observations, for instance
$o_1 \ldots o_n$, according to the theory, depend on certain elements, for instance $x \ldots y$, which
do not occur in the theoretical expression for any other of the observations. Our object is
then, by the formation of the normal equations, to separate $o_1 \ldots o_n$ as a special series of
observations. We begin by forming the partial normal equations for this, and then
immediately perform the elimination of $x \ldots y$ from them, without taking into consideration
whether these equations alone would be sufficient for a determination of the other elements.
As soon as $x \ldots y$ are eliminated, the process of elimination is suspended. The transformed
equations containing these elements (which now represent functions that are free of
all observations and functions which depend only on the remaining elements $z \ldots u$) are
put aside till we come back to the determination of $x \ldots y$. The other partially transformed
normal equations, originating in the group $o_1 \ldots o_n$, are on the other hand to be added,
term by term, to the normal equations for the elements $z \ldots u$, formed out of the remaining
observations, before the process of elimination is continued for these elements.

That this proceeding is quite legitimate becomes evident if we imagine the
elements $x \ldots y$ transformed into the elements $x' \ldots y'$ which are free of $z \ldots u$, and then
imagine $x' \ldots y'$ inserted instead of $x \ldots y$ in the original equations for the observations.
For then all the sums of products with the coefficients of $x' \ldots y'$ will identically become
$= 0$, and the sums of squares and sums of products for the separated part of the observations
will, as addenda in the coefficients of the normal equations (compare (57)), come out,
immediately, with the same values as now the transformed normal equations.

As an example we may treat the following series of measurements of the position
of 3 points on a straight line. The mode of observation is as follows. We apply a millimeter
scale several times along the straight line, and then each time read off by inspection with
the unaided eye either the places of all the points against the scale or the places of two
of them. The readings for each point are found in its separate column, and those on the
same row belong to the same position of the scale. (Considered as absolute abscissa-observations
such observations are bound by the position of the zero at every laying
down of the scale; but these bonds are evidently loosened by our taking up the position
against the scale of an arbitrarily selected fixed origin $y_i$ as an element beside the abscissae
of the three points.) All mean errors are supposed to be equal.


Position                Point             |  Eliminated free elements
of the scale      I       II       III    |

     1           6.9     27.54            |  17.22 = y_1  + (x_1 + x_2)/2
     2           8.35             54.95   |  31.65 = y_2  + (x_1 + x_3)/2
     3           7.9              54.5    |  31.20 = y_3  + (x_1 + x_3)/2
     4                   21.16    47.2    |  34.18 = y_4  + (x_2 + x_3)/2
     5                   10.74    36.7    |  23.72 = y_5  + (x_2 + x_3)/2
     6                    4.06    30.1    |  17.08 = y_6  + (x_2 + x_3)/2
     7          31.45    51.98    78.06   |  53.83 = y_7  + (x_1 + x_2 + x_3)/3
     8          32.9     53.5     79.5    |  55.30 = y_8  + (x_1 + x_2 + x_3)/3
     9           9.6     30.3     56.22   |  32.04 = y_9  + (x_1 + x_2 + x_3)/3
    10          20.16    40.78    66.8    |  42.58 = y_10 + (x_1 + x_2 + x_3)/3
    11          18.9     39.5     65.56   |  41.32 = y_11 + (x_1 + x_2 + x_3)/3



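As the theoretical equation for the observation in the $s$-th column of the $i$-th row has the form

$$o = y_i + x_s,$$

and every observation, therefore, is a function of only two elements, there is every reason
to use the method of partial elimination. If we choose first to eliminate the $y$'s, we have
consequently to form normal equations for each of the 11 rows. Where only two points,
$r$ and $s$, are observed these normal equations get the form

$$\begin{aligned}
o_r + o_s &= 2y_i + x_r + x_s \\
o_r &= y_i + x_r \\
o_s &= y_i + x_s;
\end{aligned}$$

for three points the form of the normal equations is

$$\begin{aligned}
o_1 + o_2 + o_3 &= 3y_i + x_1 + x_2 + x_3 \\
o_1 &= y_i + x_1 \\
o_2 &= y_i + x_2 \\
o_3 &= y_i + x_3.
\end{aligned}$$

Of these equations those referring to the $y_i$ have given the eliminated free elements
stated above to the right of the observations after the perpendicular.

By subtracting these equations from the corresponding other equations we get,
in the cases where there are 2 points:

$$\begin{aligned}
o_r - \tfrac{1}{2}(o_r + o_s) &= \phantom{-}\tfrac{1}{2}x_r - \tfrac{1}{2}x_s \\
o_s - \tfrac{1}{2}(o_r + o_s) &= -\tfrac{1}{2}x_r + \tfrac{1}{2}x_s,
\end{aligned}$$

and in cases where there are 3 points:

$$\begin{aligned}
o_1 - \tfrac{1}{3}(o_1 + o_2 + o_3) &= \phantom{-}\tfrac{2}{3}x_1 - \tfrac{1}{3}x_2 - \tfrac{1}{3}x_3 \\
o_2 - \tfrac{1}{3}(o_1 + o_2 + o_3) &= -\tfrac{1}{3}x_1 + \tfrac{2}{3}x_2 - \tfrac{1}{3}x_3 \\
o_3 - \tfrac{1}{3}(o_1 + o_2 + o_3) &= -\tfrac{1}{3}x_1 - \tfrac{1}{3}x_2 + \tfrac{2}{3}x_3.
\end{aligned}$$

By forming the sum of these differences for each column, and counting, on the
right side of the equations, how often each element occurs with one other or with two
others, we consequently get the ultimate normal equations:

$$\begin{aligned}
-168.98 &= \phantom{-}\tfrac{29}{6}x_1 - \tfrac{13}{6}x_2 - \tfrac{16}{6}x_3 \\
 -37.71 &= -\tfrac{13}{6}x_1 + \tfrac{32}{6}x_2 - \tfrac{19}{6}x_3 \\
+206.69 &= -\tfrac{16}{6}x_1 - \tfrac{19}{6}x_2 + \tfrac{35}{6}x_3.
\end{aligned}$$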

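The case is here simple enough to be solved by a fabricated observation. How is
its most advantageous form found, when its existence is given?

Answer:

$$o = \frac{208\,x_1 + 247\,x_2 + 304\,x_3}{759}, \qquad \text{with the weight } \frac{759^2}{23712},$$

after which we get the normal equations:

$$\begin{aligned}
\tfrac{759}{114}\,o - 168.98 &= \tfrac{759}{114}\,x_1 \\
\tfrac{759}{96}\,o - 37.71 &= \tfrac{759}{96}\,x_2 \\
\tfrac{759}{78}\,o + 206.69 &= \tfrac{759}{78}\,x_3,
\end{aligned}$$

$$x_1 = o - 25.38, \quad x_2 = o - 4.77, \quad\text{and}\quad x_3 = o + 21.24.$$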
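
The partial elimination and the freeing of the three abscissae can be retraced from the readings tabulated above. The following Python sketch is an illustration, not the author's working: the function names are assumptions, and the fabricated observation is introduced in the equivalent form of a rank-one addition that cancels the three off-diagonal coefficients of the reduced normal equations, with its value set to zero so that only the offsets $x_s - o$ are obtained.

```python
# Readings as tabulated (None = point not read at that scale position).
READINGS = [
    (6.9, 27.54, None), (8.35, None, 54.95), (7.9, None, 54.5),
    (None, 21.16, 47.2), (None, 10.74, 36.7), (None, 4.06, 30.1),
    (31.45, 51.98, 78.06), (32.9, 53.5, 79.5), (9.6, 30.3, 56.22),
    (20.16, 40.78, 66.8), (18.9, 39.5, 65.56),
]

def reduced_normal_equations(rows):
    """Eliminate each row's y_i; return (M, c) with M x = c for the x's."""
    M = [[0.0] * 3 for _ in range(3)]
    c = [0.0] * 3
    for row in rows:
        seen = [s for s in range(3) if row[s] is not None]
        k = len(seen)
        mean = sum(row[s] for s in seen) / k      # eliminated free element
        for s in seen:
            c[s] += row[s] - mean                 # column sum of differences
            for t in seen:
                M[s][t] += (1.0 if s == t else 0.0) - 1.0 / k
    return M, c

def freed_weights(M):
    """Diagonal after adding the observation that cancels the off-diagonals."""
    m01, m02, m12 = -M[0][1], -M[0][2], -M[1][2]
    # Squared coefficients a_s^2 chosen so that a_s * a_t = -M[s][t].
    a_sq = [m01 * m02 / m12, m01 * m12 / m02, m02 * m12 / m01]
    return [M[s][s] + a_sq[s] for s in range(3)]

M, c = reduced_normal_equations(READINGS)
weights = freed_weights(M)
offsets = [c[s] / weights[s] for s in range(3)]   # x_s - o
print([round(v, 2) for v in c], [round(v, 2) for v in offsets])
```

Under these assumptions the right-hand sides, the weights $759/114$, $759/96$, $759/78$, and the offsets of the three abscissae should reproduce the figures quoted in the text to the printed precision.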

From these we now compute the $y$'s:

y_1 = 32.295 - o,        y_7  = 56.80 - o,
y_2 = 33.72  - o,        y_8  = 58.27 - o,
y_3 = 33.27  - o,        y_9  = 35.01 - o,
y_4 = 25.945 - o,        y_10 = 45.55 - o,
y_5 = 15.485 - o,        y_11 = 44.29 - o.
y_6 =  8.845 - o,

We need not here state the adjusted values for the several observations, nor their
differences, of which it is enough to say that their sum vanishes both for each row and
for each column; their squares, on the other hand, will be found to be:

Position       I         II        III       Total

    1       0.0002    0.0002                0.0004
    2       0.0001              0.0001      0.0002
    3       0.0001              0.0001      0.0002
    4                 0.0002    0.0002      0.0004
    5                 0.0006    0.0006      0.0012
    6                 0.0002    0.0002      0.0004
    7       0.0009    0.0025    0.0004      0.0038
    8       0.0001    0.0000    0.0001      0.0002
    9       0.0009    0.0036    0.0009      0.0054
   10       0.0001    0.0000    0.0001      0.0002
   11       0.0001    0.0004    0.0009      0.0014

 Total:     0.0025    0.0077    0.0036      0.0138



For the summary criticism we notice that the number of observations is 27, the
number of the elements is $3 + 11 - 1 = 13$, divisor consequently 14 (one element being
wholly engaged by the fabricated observation $o$). The unit of the mean error is therefore
determined by $E^2 = 0.0010$, and the mean error on single reading $\pm 0.032$ mm, which agrees
well with what we may expect to attain by practice in estimates of tenth parts.
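
The summary criticism can be checked directly from the readings. The sketch below is an illustration with assumed names: it takes the offsets of the three abscissae as quoted in the text, lets every $y_i$ be the mean of the reduced readings in its row, and forms the sum of squared differences and the resulting unit of the mean error.

```python
# Readings as tabulated (None = point not read at that scale position).
READINGS = [
    (6.9, 27.54, None), (8.35, None, 54.95), (7.9, None, 54.5),
    (None, 21.16, 47.2), (None, 10.74, 36.7), (None, 4.06, 30.1),
    (31.45, 51.98, 78.06), (32.9, 53.5, 79.5), (9.6, 30.3, 56.22),
    (20.16, 40.78, 66.8), (18.9, 39.5, 65.56),
]
X_OFFSETS = (-25.38, -4.77, 21.24)   # x_s - o, as quoted in the text

def residual_squares(readings, x):
    """Sum of squared residuals with y_i = row mean of (reading - x_s)."""
    total = 0.0
    for row in readings:
        reduced = [o - x[s] for s, o in enumerate(row) if o is not None]
        y = sum(reduced) / len(reduced)
        total += sum((v - y) ** 2 for v in reduced)
    return total

n_obs = sum(o is not None for row in READINGS for o in row)
n_elements = 3 + 11 - 1              # three x's, eleven y's, one tied to o
divisor = n_obs - n_elements
sum_sq = residual_squares(READINGS, X_OFFSETS)
unit = sum_sq / divisor              # square of the mean error of a reading
mean_error = unit ** 0.5
print(n_obs, divisor, round(sum_sq, 4), round(mean_error, 3))
```

The unrounded sum of squares comes out close to the tabulated total; the table rounds each square to four decimals, so agreement only to about the last printed figure should be expected.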



As to special criticism it is here, where the weights of the eliminated free
functions are respectively 2 and 3 times the weight of the single observation, while the
weights of $x_1$, $x_2$, and $x_3$ after the adjustment become respectively $\tfrac{759}{114}$, $\tfrac{759}{96}$, and $\tfrac{759}{78}$, very
easy to compute the scales

$$1 - \frac{\lambda_2(u)}{\lambda_2(o)} = 1 - \frac{1}{\text{Weight after the adjustment}}.$$


With 759 as common denominator we find for the several scales and the sums of their
most natural groups:

Position       I         II        III        Sum

    1         327       327                   654
    2         331.5               331.5       663
    3         331.5               331.5       663
    4                   336       336         672
    5                   336       336         672
    6                   336       336         672
    7         436       442       448        1326
    8         436       442       448        1326
    9         436       442       448        1326
   10         436       442       448        1326
   11         436       442       448        1326

  Sum        3170      3545      3911       10626


The comparison with the sums of squares in the groups, divided by $E^2$, shows then for
point I 2.5 instead of $\tfrac{3170}{759} = 4.2 \pm \sqrt{8.4}$, for point II 7.7 instead of $4.7 \pm \sqrt{9.4}$, for
point III 3.6 instead of $5.1 \pm \sqrt{10.3}$, for all positions of the scale with two readings
2.8 instead of $5.3 \pm \sqrt{10.5}$, and for positions with 3 readings 11.0 instead of $8.7 \pm \sqrt{17.5}$.
The limit of the mean error is consequently reached only in the group of point II, where
$(7.7 - 4.7)^2 = 9.0 < 9.4$, and it is nowhere exceeded. We have a check by summing
the scales:

$$\frac{10626}{759} = 14 = 27 - 11 - 3 + 1.$$


§ 62. In such cases in which the circumstances and weights of the observations
are distributed in some regular way, this will often facilitate the treatment of the normal
equations. The elimination of the elements and the transformation of the normal equations
into such whose left hand sides can be regarded as unbound observations, as they are free
functions of the original observations, need not always be so firmly connected with one another
as in the ordinary method. If we, in a suitable way, take advantage of regularity in the
observations, and thereby are able to find a transformation which sets the normal equations free,
then the determination of the several elements will scarcely throw any material obstacles
in our way. But in order to find out any special transformations, we must know the
general form of the changes of the normal equations resulting from transformation of the
original elements into such as are any homogeneous linear functions of them whatever.

If the equations for the unbound observations in terms of the original elements
have been

$$o_i = p_i x + q_i y + r_i z,$$

the normal equations will be:

$$\begin{aligned}
[po] &= [pp]\,x + [pq]\,y + [pr]\,z \\
[qo] &= [qp]\,x + [qq]\,y + [qr]\,z \\
[ro] &= [rp]\,x + [rq]\,y + [rr]\,z.
\end{aligned}$$

And if we wish to substitute new elements, $\xi$, $\eta$, and $\zeta$ for the old ones, we make use of
substitutions in which the original elements are represented as functions of the new ones,
therefore

$$\begin{aligned}
x &= h_1\xi + k_1\eta + l_1\zeta \\
y &= h_2\xi + k_2\eta + l_2\zeta \\
z &= h_3\xi + k_3\eta + l_3\zeta.
\end{aligned} \qquad (114)$$

The equations for the observations then have the form

$$o_i = (p_i h_1 + q_i h_2 + r_i h_3)\,\xi + (p_i k_1 + q_i k_2 + r_i k_3)\,\eta + (p_i l_1 + q_i l_2 + r_i l_3)\,\zeta. \qquad (115)$$

The new normal equations may be formed from these, but the form becomes very cumbrous,
the equation which specially refers to $\xi$ being

$$\begin{aligned}
[(ph_1 + qh_2 + rh_3)\,o] = {} & [(ph_1 + qh_2 + rh_3)^2]\,\xi + [(ph_1 + qh_2 + rh_3)(pk_1 + qk_2 + rk_3)]\,\eta \\
& + [(ph_1 + qh_2 + rh_3)(pl_1 + ql_2 + rl_3)]\,\zeta.
\end{aligned}$$

The computation ought not to be performed according to the expressions for the coefficients
which come out when we get rid of the round brackets under the signs of summation [ ].
But it is easy to give the rule of the computation with full clearness. The old normal
equations are first treated exactly as if they were equations for unbound observations, for
$x$, $y$, and $z$ respectively, expressed by the new elements; consequently by multiplication,
by columns, by $h_1$, $h_2$, and $h_3$ and addition; by multiplication by $k_1$, $k_2$, and $k_3$ and
addition; and by multiplication by $l_1$, $l_2$, and $l_3$ and succeeding addition. Thereby, certainly,
we get the new normal equations, but still with preservation of the old elements:



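$$\begin{aligned}
[(ph_1 + qh_2 + rh_3)\,o] &= [(ph_1 + qh_2 + rh_3)\,p]\,x + [(ph_1 + qh_2 + rh_3)\,q]\,y + [(ph_1 + qh_2 + rh_3)\,r]\,z \\
[(pk_1 + qk_2 + rk_3)\,o] &= [(pk_1 + qk_2 + rk_3)\,p]\,x + [(pk_1 + qk_2 + rk_3)\,q]\,y + [(pk_1 + qk_2 + rk_3)\,r]\,z \\
[(pl_1 + ql_2 + rl_3)\,o] &= [(pl_1 + ql_2 + rl_3)\,p]\,x + [(pl_1 + ql_2 + rl_3)\,q]\,y + [(pl_1 + ql_2 + rl_3)\,r]\,z.
\end{aligned} \qquad (116)$$

The second part of the operation must therefore consist in the substitution of the
new elements for the original ones in the right hand sides of these equations. In order
to find the coefficients of $\xi$, $\eta$, and $\zeta$ we must therefore here again multiply the sums of
the products, now by rows, by

$$\begin{matrix} h_1, & h_2, & h_3 \\ k_1, & k_2, & k_3 \\ l_1, & l_2, & l_3 \end{matrix}$$

and add them up.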

Example. It happens pretty often, for instance in investigations of scales for
linear measures, that there is symmetry between the elements, two and two, $x_r$ and $x_{m-r}$,
so that for instance the normal equation which specially refers to $x_r$ has the same
coefficients, only in inverted order, as the normal equation corresponding to $x_{m-r}$; of course,
irrespective of the two observed terms $[po]$ on the left hand sides of the equations.
P. A. Hansen already pointed out that this indicates a transformation of the elements
into the mean values $s_r = \tfrac{1}{2}(x_r + x_{m-r})$ and their half differences $d_r = \tfrac{1}{2}(x_r - x_{m-r})$; in
this case therefore the equations for the old elements by the new ones have the form

$$\begin{aligned}
x_r &= s_r + d_r \\
x_{m-r} &= s_r - d_r,
\end{aligned}$$

and the transformation of the normal equations is, consequently, performed just by forming
sums and differences of the original coefficients. If the normal equations are

$$\begin{aligned}
[ao] &= 4x + 3y + 2z + 1u \\
[bo] &= 3x + 6y + 4z + 2u \\
[co] &= 2x + 4y + 6z + 3u \\
[do] &= 1x + 2y + 3z + 4u,
\end{aligned}$$

the procedure is as follows:

$$\begin{aligned}
[ao] + [do] &= 5x + 5y + 5z + 5u = 10\,\tfrac{x+u}{2} + 10\,\tfrac{y+z}{2} \\
[bo] + [co] &= 5x + 10y + 10z + 5u = 10\,\tfrac{x+u}{2} + 20\,\tfrac{y+z}{2} \\
[ao] - [do] &= 3x + 1y - 1z - 3u = 6\,\tfrac{x-u}{2} + 2\,\tfrac{y-z}{2} \\
[bo] - [co] &= 1x + 2y - 2z - 1u = 2\,\tfrac{x-u}{2} + 4\,\tfrac{y-z}{2}.
\end{aligned}$$

As in this example, we always succeed in separating the mean values from the half
differences, as two mutually free systems of functions of the observations.
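
The rule of this paragraph (combine the old normal equations by columns, then substitute by rows) amounts to forming $T^{\mathsf T} N T$ when the old elements are written as $T$ times the new ones. The Python sketch below applies it to the four normal equations of the example; the function name and the matrix layout are illustrative assumptions.

```python
def transform_normal_equations(N, T):
    """New normal matrix T^T N T for the substitution old = T @ new."""
    n, m = len(N), len(T[0])
    # Step 1 (by columns): combine the old normal equations, giving T^T N.
    TtN = [[sum(T[k][i] * N[k][j] for k in range(n)) for j in range(n)]
           for i in range(m)]
    # Step 2 (by rows): substitute the new elements on the right-hand
    # sides, giving (T^T N) T.
    return [[sum(TtN[i][k] * T[k][j] for k in range(n)) for j in range(m)]
            for i in range(m)]

# Coefficients of the normal equations [ao], [bo], [co], [do] in x, y, z, u.
N = [[4, 3, 2, 1],
     [3, 6, 4, 2],
     [2, 4, 6, 3],
     [1, 2, 3, 4]]

# Rows: x, y, z, u.  Columns: s_1 = (x+u)/2, s_2 = (y+z)/2,
# d_1 = (x-u)/2, d_2 = (y-z)/2, so that x = s_1 + d_1, u = s_1 - d_1, etc.
T = [[1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 1, 0, -1],
     [1, 0, -1, 0]]

new_N = transform_normal_equations(N, T)
for row in new_N:
    print(row)
```

The transformed matrix falls apart into two 2-by-2 blocks, one for the mean values (coefficients 10, 10, 10, 20) and one for the half differences (6, 2, 2, 4), which is the separation into two mutually free systems stated in the text.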

§ 63. The great simplification that results when the observations are mere
repetitions, in contradistinction to the general case when there are varying circumstances in
the observations, is owing to the fact that the whole adjustment is then reduced to the
determination of the mean values and the mean errors of the observations. Before an
adjustment, therefore, we not only take the means of any observations, which are strictly speaking
repetitions, but we also save a good deal of work in the cases which only approximate to
repetitions, viz. those where the variations of circumstances have been small enough to allow
us to neglect their products and squares. It has not been necessary to await the systematic
development of the theory of observations to know how to act in such cases.

When astronomers have observed the place of a planet or a comet several times
in the same night, they form a mean time of observation $t$, a mean right ascension $\alpha$,
and a mean declination $\delta$, and consider $\alpha$ and $\delta$ the spherical co-ordinates of the star at
the time $t$.

With the obvious extensions this is what is called the normal place method, the
most important device in practical adjustment. Such observations whose essential
circumstances have "small" variations are, before the adjustment, brought into a normal place, by
forming mean values both for the observed values themselves and for each of their essential
circumstances, and on the supposition that the law which connects the observations and
circumstances holds good also, without any change, with respect to their mean values.

Much trouble may be spared by employing the normal place method. The question 
is, whether we lose thereby in exactness, and then how much. 

We shall first consider the case where the unbound observations $o$ are linear
functions of the varying essential circumstances $x, \ldots, z$, the equation for the
observations being:

$$\lambda_1(o) = a + bx + \cdots + dz.$$

With the weights $v$ we form the normal equations:

$$[vo] = a[v] + b[vx] + \cdots + d[vz] \qquad (117)$$

$$\begin{aligned}
[vxo] &= a[vx] + b[vx^2] + \cdots + d[vxz] \\
&\;\;\vdots \\
[vzo] &= a[vz] + b[vzx] + \cdots + d[vz^2].
\end{aligned} \qquad (118)$$

If the whole series of observations is gathered into a single normal place, $O$,
corresponding to the circumstances $X \ldots Z$ and with the weight $V$, we shall have:

$$\begin{aligned}
V &= [v] \\
VO &= [vo] \\
VX &= [vx] \\
&\;\;\vdots \\
VZ &= [vz],
\end{aligned}$$

and as

$$O = a + bX + \cdots + dZ, \qquad (117a)$$

this normal place will exhaust the normal equation (117) corresponding to the constant
term, both with respect to mean value and mean error. But if we make the other normal
equations free of (117), we get, by the correct method of least squares:

$$\begin{aligned}
[v(o - O)(x - X)] &= b[v(x - X)^2] + \cdots + d[v(x - X)(z - Z)] \\
&\;\;\vdots \\
[v(o - O)(z - Z)] &= b[v(x - X)(z - Z)] + \cdots + d[v(z - Z)^2]
\end{aligned} \qquad (118a)$$

for the determination of the elements $b \ldots d$, and these determinations are lost completely
if the whole series is gathered into a single normal place. Certainly, the coefficients of these
equations (118a) are small quantities of the second order, if the $x - X$ and $z - Z$ are
small of the first order.

If, on the other hand, we split up the series, forming for each part a normal
place, and adjusting these normal places instead of the observations according to the
method of the least squares, then the normal equation corresponding to the constant
term is still exhausted by the normal place method; and besides this determination of
a + bX + ... + dZ the normal place method now also affords a determination of the other
elements b, ..., d, in such a way, however, that we suffer a loss of the weights for their
determination. This loss can become great, nay total, if the normal places are selected in
a way that does not suit the purpose; but it can be made rather insignificant by a
suitable selection of normal places in not too small a number.

Let us suppose, in order to simplify matters, that the observations have only one
variable essential circumstance x, of which their mean values are linear functions,
consequently

$$\lambda_1(o) = a + bx,$$

and that the x's are uniformly distributed within the utmost limits, $x_0$ and $x_1$; we then let each
normal place encompass an equally large part of this interval, and we shall find then, this
being the most favourable case, with n normal places, that the weight on the adjusted value of
the element b becomes $1 - (1/n)^2$, if by a correct adjustment by elements the corresponding
weight is taken as unity. The loss is thus, at any rate, not very great. And it can be
made still smaller, if the distribution of the essential circumstance of the observations is
uneven, and if we can get a normal place everywhere where the observations become
particularly frequent, while empty spaces separate the normal places from each other.
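
The weight $1 - (1/n)^2$ can be checked numerically. The following is a minimal sketch in Python with NumPy, assuming equally weighted observations spread evenly over the interval; the function name `slope_weight_ratio` and the grid of 6000 points are illustrative assumptions, not part of the text. It uses the fact that, for a straight line, the weight on b is proportional to the weighted sum of squares of the circumstances about their mean.

```python
import numpy as np

def slope_weight_ratio(x, n_places):
    """Weight on the slope b from n_places normal places, relative to the
    weight obtained from the individual (equally weighted) observations."""
    x = np.sort(np.asarray(x, dtype=float))
    parts = np.array_split(x, n_places)           # equally large parts of the interval
    centres = np.array([p.mean() for p in parts])  # circumstance X of each normal place
    counts = np.array([p.size for p in parts])     # weight V of each normal place
    grand_mean = x.mean()
    full = np.sum((x - grand_mean) ** 2)           # proportional to weight on b, all observations
    grouped = np.sum(counts * (centres - grand_mean) ** 2)  # same, normal places only
    return grouped / full

x = np.linspace(0.0, 1.0, 6000)
for n in (2, 3, 6):
    print(n, slope_weight_ratio(x, n), 1 - 1 / n**2)
```

On such an even grid the two printed columns should agree closely, and the ratio approaches unity quickly as the number of normal places grows.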

The case is analogous also when the observations are still functions of a single 
or a few essential circumstances, but the function is of a higher degree, or transcendental. 
For it is possible also to form normal places in these cases; and we can do so not only 
when the variations of the circumstances can be directly treated as infinitely small within 
each normal place, which case by Taylor’s theorem falls within the given rule. For if we 
have at our disposal a provisional approximate formula, y = f(x), and have calculated the
deviation from this, o - y, of every observation (considering the deviations as observations
with the essential circumstances and mean errors of the original observations), then we
can use mean numbers of deviations for reciprocally adjacent circumstances as corrections 
which, added to the corresponding values from the approximate formula, give the normal 
values. Further, it is required here only that no normal place is made so comprehensive 
that the deviations within its limits do not remain linear functions of the essential 
circumstances. 

Also here part of the correctness is lost, and it is difficult to say how much. The
loss is, under equal circumstances, smaller, the more normal places we form. With twice
(or three times) as many normal places as the number of the unknown elements of the
problem, it will rarely become perceptible. With due regard to the essential circumstances
and the distribution of the weights we can reduce it, using empty spaces as boundaries
between the normal places.

A suitable distribution of the normal places also depends on what function the
observations are of their essential circumstances. As to this, however, it is, as a rule,
sufficient to know the behaviour of the integral algebraic functions, as we generally, when
we have to do with functions which are essentially different from these, will try through
transformations of the variables to get back to them and to certain functions which
resemble them in this respect.

We need only consider the cases in which we have only one variable essential
circumstance, of which the mean value of the observation is an algebraic function of the
r-th degree. We are able then, on any supposition as to the distribution of the observations,
o, and their essential circumstances, x, and weights, v, to determine r+1 substitutive
observations, O, together with the essential circumstances, X, and weights, V, belonging
to them, in such a way that they, treated according to the method of the least squares,
will give the same results as the larger number of actual observations. The conditions are:

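$$\begin{aligned}
[ov] &= O_0V_0 + \dots + O_rV_r \\
&\;\;\vdots \\
[x^r ov] &= X_0^r O_0V_0 + \dots + X_r^r O_rV_r
\end{aligned} \tag{119}$$

and

$$\begin{aligned}
[v] &= V_0 + \dots + V_r \\
&\;\;\vdots \\
[x^{2r}v] &= X_0^{2r}V_0 + \dots + X_r^{2r}V_r.
\end{aligned} \tag{120}$$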


These 3r+2 equations are not quite sufficient for the determination of the 3r+3
unknowns. We remove the difficulty in the best way by adding the equation:

$$[x^{2r+1}v] = X_0^{2r+1}V_0 + \dots + X_r^{2r+1}V_r.$$

The elimination of the V's (and O's) then leads to an equation of the (r+1)-th degree, whose
roots $X_0, \dots, X_r$ are all real quantities, if the given x's have been real and the v's
positive. When the roots are found, we can compute, first $V_0, \dots, V_r$ and afterwards
$O_0, \dots, O_r$, by means of two systems of r+1 linear equations with r+1 unknowns.
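
The procedure just described (an equation of the (r+1)-th degree for the X's, then two linear systems for the V's and O's) can be sketched as follows. This is only an illustration in Python with NumPy under stated assumptions: the supplementary condition is taken to be the moment equation of order 2r+1, and the function name `substitutive_observations` is hypothetical.

```python
import numpy as np

def substitutive_observations(x, o, v, r):
    """Return (X, V, O): r+1 substitutive circumstances, weights and observations
    that reproduce the sums [x^k v] for k <= 2r+1 and [x^k o v] for k <= r."""
    x, o, v = (np.asarray(t, dtype=float) for t in (x, o, v))
    n = r + 1
    m = np.array([np.sum(v * x**k) for k in range(2 * n)])   # [x^k v], k = 0..2r+1
    # Monic polynomial X^n + c_{n-1} X^{n-1} + ... + c_0 whose roots are the X's:
    # sum_j c_j m_{k+j} = -m_{k+n} for k = 0..n-1 (elimination of the V's).
    H = np.array([[m[i + j] for j in range(n)] for i in range(n)])
    c = np.linalg.solve(H, -m[n:2 * n])
    X = np.sort(np.roots(np.concatenate(([1.0], c[::-1]))).real)
    P = np.vander(X, n, increasing=True).T                   # P[k, i] = X_i**k
    V = np.linalg.solve(P, m[:n])                            # first linear system
    mo = np.array([np.sum(v * o * x**k) for k in range(n)])  # [x^k o v], k = 0..r
    OV = np.linalg.solve(P, mo)                              # second linear system
    return X, V, OV / V

# Dense, evenly spread observations on (-1, +1) with total weight 2 (an assumed test case)
N = 2000
x = -1 + (2 * np.arange(N) + 1) / N
v = np.full(N, 2.0 / N)
o = 1 + 2 * x + 3 * x**2
print(substitutive_observations(x, o, v, r=2))
```

For a nearly continuous, even distribution the sketch should return circumstances and weights close to the tabulated values for the corresponding degree.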

If, for instance, the essential circumstances of the actual observations are contained
in the interval from -1 to +1, and if the observations are so numerous and so equally
distributed that they may be looked upon as continuous with constant mean error
everywhere in this interval; if, further, the sum of the weights = 2; then the distribution of
the substitutive observations will be symmetrical around 0, and, for functions of the lowest
degrees,


 r = 0   X =   .000
         V =  2.000

 r = 1   X =  -.577,  +.577
         V =  1.000,  1.000

 r = 2   X =  -.775,   .000,  +.775
         V =   .556,   .889,   .556

 r = 3   X =  -.861,  -.340,  +.340,  +.861
         V =   .348,   .652,   .652,   .348

 r = 4   X =  -.906,  -.538,   .000,  +.538,  +.906
         V =   .237,   .479,   .569,   .479,   .237

 r = 5   X =  -.932,  -.661,  -.239,  +.239,  +.661,  +.932
         V =   .171,   .361,   .468,   .468,   .361,   .171

 r = 6   X =  -.949,  -.742,  -.406,   .000,  +.406,  +.742,  +.949
         V =   .129,   .280,   .382,   .418,   .382,   .280,   .129
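
The tabulated circumstances and weights agree, to three decimals, with the nodes and weights of what is now called Gauss-Legendre quadrature on the interval from -1 to +1. A short sketch using NumPy's `leggauss` regenerates them; the wrapper name `legendre_places` is an illustrative assumption.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def legendre_places(r):
    """Circumstances X and weights V of the r+1 substitutive observations
    for an even, continuous distribution on (-1, +1) with total weight 2."""
    X, V = leggauss(r + 1)   # nodes and weights; the weights sum to 2
    return X, V

for r in range(7):
    X, V = legendre_places(r)
    print(f"r = {r}")
    print("  X =", np.round(X, 3))
    print("  V =", np.round(V, 3))
```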


If, in another example, the distribution of the observations is, likewise, continuous,
but the weights within the element dx proportional to $e^{-\frac{1}{2}x^2}$, consequently symmetrical with
maximum by x = 0, then the distribution for the lowest degrees, the only ones of any
practical interest, will be







 r = 0   X =    .000
         V =   2.000

 r = 1   X =  -1.000,  +1.000
         V =   1.000,   1.000

 r = 2   X =  -1.732,    .000,  +1.732
         V =    .333,   1.333,    .333

 r = 3   X =  -2.334,   -.742,   +.742,  +2.334
         V =    .092,    .908,    .908,    .092

 r = 4   X =  -2.857,  -1.356,    .000,  +1.356,  +2.857
         V =    .023,    .444,   1.067,    .444,    .023

 r = 5   X =  -3.324,  -1.889,   -.617,   +.617,  +1.889,  +3.324
         V =    .005,    .177,    .818,    .818,    .177,    .005

 r = 6   X =  -3.750,  -2.367,  -1.154,    .000,  +1.154,  +2.367,  +3.750
         V =    .001,    .062,    .480,    .914,    .480,    .062,    .001
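
These values agree, to three decimals, with Gauss-Hermite nodes for the weight function $e^{-\frac{1}{2}x^2}$, once the weights are rescaled so that their sum is 2. The following sketch uses NumPy's `hermegauss`; the wrapper name `hermite_places` and the rescaling convention are assumptions made for the comparison.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def hermite_places(r):
    """Circumstances X and weights V of the r+1 substitutive observations when
    the weights of the observations are proportional to exp(-x**2 / 2)."""
    X, w = hermegauss(r + 1)       # weights sum to sqrt(2*pi)
    return X, 2.0 * w / w.sum()    # rescaled so that the sum of the weights is 2

for r in range(7):
    X, V = hermite_places(r)
    print(f"r = {r}")
    print("  X =", np.round(X, 3))
    print("  V =", np.round(V, 3))
```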


If we were able now to represent these substitutive observations as normal places,
then we should be able also, by the use of such tables in analogous cases, to prevent any
loss of exactness. It would be possible entirely to evade the application of the method of
the least squares; we had but to form such qualified normal places in just the same
number as the adjustment formula contains elements that are to be determined. This,
however, is not possible. Certainly, we can obtain normal places corresponding to the
required values of the essential circumstance, but we cannot by a simple formation of
mean numbers give them the weight which each of them ought to have, without employing
some of the observations twice, others not at all. By taking into consideration how much
the extreme normal places from this reason must lose in weight, compared to the
substitutive observations, we can estimate how many per cent the loss, in the worst case,
can amount to. In the first of our examples we find the loss to be 0, for r = 0 and
r = 1; but for r = 2 we lose 15, for r = 3 we lose 19, for r = 4 we lose 20, and
for greater values of r 21 per cent.

Example. Eighteen unbound observations, equally good, $\lambda_2(o) = \tfrac{1}{12}$, correspond
to an essential circumstance whose values are distributed as the prime numbers p from
1049 to 1171. Taking (p - 1105) : 100 = x as the essential circumstance of the
observation o, we have:

     x       o    |     x       o    |     x       o
   -.56    -.41   |   -.14    -.15   |   +.18    -.24
   -.54    +.50   |   -.12    -.32   |   +.24    +.09
   -------------- |                  | --------------
   -.44    -.03   |   -.08    +.33   |   +.46    +.39
   -.42    -.15   |   -.02    -.21   |   +.48    +.12
                  | -------------- | --------------
   -.36    +.48   |   +.04    +.21   |   +.58    -.24
   -------------- |                  |
   -.18    +.18   |   +.12    +.40   |   +.66    -.39


Dividing these observations into groups indicated by the horizontal lines, we get
the 6 normal places:

      X         O       weight
    -.550     +.045       2
    -.407     +.100       3
    -.108     -.034       5
    +.145     +.115       4
    +.470     +.255       2
    +.620     -.315       2


If we suppose the mean values of the observations to be a function of the third,
eventually second, degree of x, $\lambda_1(o) = a + bx + cx^2 + dx^3$, we have by ordinary
application of the adjustment by elements the normal equations:

$$\begin{aligned}
6.72 &= 216.00a - 1.20b + 29.98c + 1.94d \\
-3.07 &= -1.20a + 29.98b + 1.94c + 8.11d \\
-1.07 &= 29.98a + 1.94b + 8.11c + 1.21d \\
-1.47 &= 1.94a + 8.11b + 1.21c + 2.56d.
\end{aligned}$$

By the free equations:

$$\begin{aligned}
6.72 &= 216.00a - 1.20b + 29.98c + 1.94d \\
-3.03 &= 29.97b + 2.11c + 8.12d \\
-1.79 &= 3.80c + .37d \\
-.54 &= .305d
\end{aligned}$$

we get:

    a = +.09,    a' = +.10,
    b = +.40,    b' = -.07,
    c = -.30,    c' = -.47,
    d = -1.77,

where a', b', c' are the coefficients in the functions of second degree, obtained by
presupposing d = 0.





Now, by application of the normal places instead of the original observations, we
obtain on the same suppositions the normal equations:

$$\begin{aligned}
6.72 &= 216.00a - 1.20b + 29.45c + 1.87d \\
-2.84 &= -1.20a + 29.45b + 1.87c + 7.93d \\
-.54 &= 29.45a + 1.87b + 7.93c + 1.14d \\
-1.57 &= 1.87a + 7.93b + 1.14c + 2.45d.
\end{aligned}$$

By the free equations:

$$\begin{aligned}
6.72 &= 216.00a - 1.20b + 29.45c + 1.87d \\
-2.80 &= 29.44b + 2.03c + 7.94d \\
-1.26 &= 3.77c + .34d \\
-.76 &= .263d,
\end{aligned}$$

we get:

    a = +.07,    a' = +.08,
    b = +.69,    b' = -.07,
    c = -.07,    c' = -.33,
    d = -2.88.

A comparison between these two calculations, particularly between the leading
coefficients in the free equations, shows that the loss of weight amounts to 1 - .263/.305, or
14 per cent. But it is only in the equation for d that the loss is so great; in the
equations for b and c, respectively, it is only two and one per cent.

Our normal places are very good if the function is only of the first or second
degree; for the function of third degree they can be admitted even though the values of
the elements a, b, c, d have changed considerably. For functions of 4th or higher degrees
these normal places would prove insufficient.
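
The example can be recomputed as a check. The sketch below, in Python with NumPy, assumes the observations as listed in the table above (read column by column, with weight 12 each), forms the normal equations for a cubic, and compares the leading coefficients of the free equations, obtained here as the squared diagonal of a Cholesky factor. All function names are illustrative assumptions; small differences in the last printed digit against the text are to be expected from rounding.

```python
import numpy as np

primes = np.array([1049, 1051, 1061, 1063, 1069, 1087, 1091, 1093, 1097,
                   1103, 1109, 1117, 1123, 1129, 1151, 1153, 1163, 1171])
x = (primes - 1105) / 100.0
o = np.array([-.41, .50, -.03, -.15, .48, .18, -.15, -.32, .33,
              -.21, .21, .40, -.24, .09, .39, .12, -.24, -.39])
v = np.full(x.size, 12.0)            # equally good observations, lambda_2(o) = 1/12
group_sizes = [2, 3, 5, 4, 2, 2]     # groups marked by the horizontal lines

def normal_equations(x, o, v, degree=3):
    """Matrix and left-hand sides of the normal equations for a + bx + cx^2 + ..."""
    A = np.vander(x, degree + 1, increasing=True)
    return A.T @ (v[:, None] * A), A.T @ (v * o)

def free_leading_coefficients(N):
    """Pivots of the successive elimination (leading coefficients of the free equations)."""
    return np.diag(np.linalg.cholesky(N)) ** 2

def normal_places(x, o, v, sizes):
    """Weighted means of circumstance and observation within each group."""
    parts = np.split(np.arange(x.size), np.cumsum(sizes)[:-1])
    V = np.array([v[i].sum() for i in parts])
    X = np.array([np.sum(v[i] * x[i]) for i in parts]) / V
    O = np.array([np.sum(v[i] * o[i]) for i in parts]) / V
    return X, O, V

N_obs, rhs_obs = normal_equations(x, o, v)
X, O, V = normal_places(x, o, v, group_sizes)
N_np, rhs_np = normal_equations(X, O, V)

print("elements from observations :", np.round(np.linalg.solve(N_obs, rhs_obs), 2))
print("elements from normal places:", np.round(np.linalg.solve(N_np, rhs_np), 2))
p_obs, p_np = free_leading_coefficients(N_obs), free_leading_coefficients(N_np)
print("loss of weight per element :", np.round(1 - p_np / p_obs, 3))
```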

§ 64. That graphical adjustment is a means which can carry us through great
difficulties, we have shown already in practice by applying it to the drawing of curves of
errors. The remarkable powers of the eye and the hand must, like a deus ex machina,
help us where all other means fail.

Adjustment by drawing is restricted only by one single condition: if we are to 
represent a relation between quantities by a plane curve, there must be only two quantities; 
one of these, represented by the ordinate, is, or is considered to be, the observed value;
and the other, represented by the abscissa, is considered the only essential circumstance
on which the observed value depends.

Examples of graphical adjustment with two essential circumstances do occur,
however, for instance in weather-charts. In periodic phenomena polar co-ordinates are
preferred. But otherwise each observation is represented by a point whose ordinate and
abscissa are, respectively, the observed value and its essential circumstance; and the
adjustment is performed by free-hand drawing of a curve which satisfies the two conditions
of being free from irregularities and going as near as possible to the several points of
observation. The smoothness of the curve in this process plays the part of the theory,
and it is a matter of course that we succeed relatively best when the theory is unknown
or extremely intricate; when, for instance, we must confine ourselves to requiring that the
phenomenon must be continuous within the observed region, or be a single valued function.
But also such a theoretical condition as, for instance, the one that the law of dependence
must be of an integral, rational form, may be successfully represented by graphical
adjustment, if the operator has had practice in the drawing of parabolas of higher degrees. And
we have seen that also such functional forms as have the rapid approximation to an asymptote
which the curves of error demand, lie within the province of the graphical adjustment.

As for the approximation to the several observed points, the idea of the adjust¬ 
ment implies that a perfect identity is not necessary; only, the curve must intersect the 
ordinates so near the points as is required by the several mean errors or laws of errors. 
If, after all, we know anything as to the exactness of the several observations before we 
make the adjustment, this ought to be indicated visibly on the drawing-paper and used 
in the graphical adjustment. We cannot pay much regard, of course, to the presupposed 
typical form and other properties of the law of errors, but something may be attained,
particularly with regard to the number of similar deviations. 

If we know nothing whatever as to the exactness of the several observations, or 
only that they are all to be considered equally good, there can be only a single point in 
our figure for each observation. In a graphical adjustment, however, we can and ought
to take care that the curve we draw has the same number of observed points on each
side of it, not only in its whole extent, but also as far as possible for arbitrary divisions.
If we know the weights of the observations, they may be indicated on the drawing, and
observations with the weight n count n-fold.

In contradistinction to this it is worth while to remark that, with the exception 
only of bonds between observations, represented by different points, it is possible to lay 
down on the paper of adjustment almost all desirable information about the several laws of 
errors. Around each point whose co-ordinates represent the mean values of an observation 
and of its essential circumstance, a curve, the curve of mean errors, may be drawn in 
such a way that a real intersection of it with any curve of adjustment indicates a
deviation less than the mean error resulting from the combination of the mean errors of the
observed value and that of its essential circumstance, if this is also found by observation,
while a passing over or under indicates a deviation exceeding the mean error. Evidently,
drawings furnished with such indications enable us to make very good adjustments.



If the laws of errors both for the observation and for its circumstance are typical, 
then the curve of mean errors is an ellipse with the observed points in its centre. 

If, further, there are no bonds between the observation and its circumstance, then 
the ellipse of mean errors has its axes parallel to the ordinate and the abscissa, and their
lengths are double the respective mean errors. 

If the essential circumstance of the observation, the abscissa, is known to be free
of errors, the ellipse of the mean errors is reduced to the two points on the ordinate, 
distant by the mean error of the observation from the central point of observation. In 
special cases other means of illustrating the laws of errors may be used. If, for instance, 
the mean errors as well as the mean values are continuous functions of the essential 
circumstance of the observation, continuous curves for the mean errors may be drawn on 
the adjustment paper. 

The principal advantages of the graphical adjustment are its indication of gross 
errors and its independence of a definitely formulated theory. By measuring the ordinates
of the adjusted curve we can get improved observations corresponding to as many values
of the circumstance or abscissa as we wish, and we can select them as we please within
the limits of the drawing. But these adjusted observations are strongly bound together,
and we have no indication whatever of their mean errors. Consequently, no other
adjustment can be based immediately upon the results of a graphical adjustment.

On the other hand, graphical adjustment can be very advantageously combined 
with interpolations, both preceding and following, and we shall see later on that by this 
means we can remedy its defects, particularly its limited accuracy and its tendency to
place too much confidence in the observations, and too little in the theory, i. e. to give
an under-adjustment.

By drawing we attain an exactness of only 3 or 4 significant figures, and that is
frequently insufficient. The scale of the drawing must be chosen in such a way that the
errors of observations are visible; but then the dimensions may easily become so large that
no paper can contain the drawing. In order to give the eye a full grasp of the figure,
the latter must in its whole course show only small deviations from the straight line, which
is taken as the axis of abscissae. This is a practical hint, founded upon experience. The
eye can judge of the smoothness of other curves also, but not by far so well as of that
of a straight line. And if the line forms a large angle with the axis of the abscissae,
then the exactness is lost by the flat intersections with the ordinates. Therefore, as a rule,
it is not the original observations that are marked on the paper when we make a graphical
adjustment, but only their differences from values found by a preceding interpolation.

In order to avoid an under-adjustment, we must allow one third of the deviations of the
curve from the observation-points to surpass the mean errors. It is further essential that
the said interpolation is based on a minimum number of observed data; and after the
graphical adjustment has been made, it is safe to try another interpolation using a smaller
number of the adjusted values as the base of a new interpolation and a repeated graphical
adjustment.
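
As a rough aid to the rules of thumb given for graphical adjustment, the following sketch computes, for a typical (normal) law of errors, the share of deviations expected to exceed the mean error, and tallies the same share together with the number of points on each side of a drawn curve. The function names and the sample deviations are illustrative assumptions.

```python
import math

def share_exceeding(k=1.0):
    """Expected share of deviations larger than k mean errors under a typical law."""
    return math.erfc(k / math.sqrt(2.0))

def summary_check(deviations, mean_errors):
    """Share of deviations surpassing their mean errors, and the counts of
    observed points above and below the curve of adjustment."""
    exceed = sum(abs(d) > e for d, e in zip(deviations, mean_errors))
    above = sum(d > 0 for d in deviations)
    below = sum(d < 0 for d in deviations)
    return exceed / len(deviations), above, below

print(share_exceeding())   # a little under one third
print(summary_check([.5, -2, .1, -.3, 1.5, -.2], [1.0] * 6))
```

A drawn curve whose tally falls far below this share has probably followed the observations too closely.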

If the results of a graphical adjustment are required only in the form of a table
representing the adjusted observations as a function of the circumstance as argument, this 
table also ought to be based on an interpolation between relatively few measured values, 
the interpolated values being checked by comparison with the corresponding measured 
values. A table of exclusively measured values will show too irregular differences. 

When we have corrected these values by measuring the ordinates in a curve of 
graphical adjustment, they may be employed instead of the observations as a sort of normal 
places. It has been said, however, and it deserves to be repeated, that they must not be 
adjusted by means of the method of the least squares, like the normal places properly so 
called. But we can very well use both sorts of normal places, in a just sufficient number,
for the computation of the unknown elements of the problem, according to the rules of 
exact mathematics. 

That we do not know their weights, and that there are bonds between them, will not 
here injure the graphically determined normal places. The very circumstance that even distant 
observations by the construction of the curve are made to influence each normal place, is an
advantage. It is not necessary here to suffer any loss of exactness, as by the other normal
places, which, as they are to be represented as mean numbers, cannot at the same time be
put in the most advantageous places and obtain the due weight. As to the rest, however, what
has been said p. 108-110 about the necessity of putting the substitutive observations in
the right place, holds good also, without any alteration, of the graphical normal places.

The method of the graphical adjustment enables us to execute the drawing with 
absolute correctness, and it leaves us full liberty to put the normal places where we like, 
consequently also in the places required for absolute correctness; but in both these respects
it leaves everything to our tact and practice, and gives no formal help to it. 

As to the criticism, the graphical adjustment gives no information about the mean 
errors of its results. But, if we can state the mean error of each observation, we are able, 
nevertheless, to subject the graphical adjustments to a summary criticism, according to
the rule 

And with respect to the more special criticism on systematical deviations, the graphical
method even takes a very high rank. Through graphical representations of the finally
remaining deviations, o - u, particularly if we can also lay down the mean errors on the
same drawing, we get the sharpest check on the objective correctness of any adjustment.



From this reason, and owing to the proportionally slight difficulties attached to it,
the graphical adjustment becomes particularly suitable where we are to lay down new 
empirical laws. In such cases we have to work through, to check, and to reject series 
of hypotheses as to the functional interdependency of observations and their essential 
circumstances. We save much labour, and illustrate our results, if we work by graphical 
adjustment. 

Of course, we are not obliged to subject observations to adjustment. In the
preliminary stages, or as long as it is doubtful whether a greater number of essential
circumstances ought not to be taken into consideration, it may even be the best thing to give
the observations just as they are.

But if we use the graphical form in order to illustrate such statements by the 
drawing of a line which connects the several observed points, then we ought to give this 
line the form of a continuous curve and not, according to a fashion which unfortunately 
is widely spread, the form of a rectilinear polygon which is broken in every observed 
point. Discontinuity in the curve is such a marked geometrical peculiarity that it ought, 
even more than cusps, double-points, and asymptotes, to be reserved for those cases in 
which the author expressly wants to give his opinion on its occurrence in reality.


XIV. THE THEORY OF PROBABILITY.

§ 65. We have already, in § 9, defined “probability” as the limit to which (the
law of the large numbers taken for granted) the relative frequency of an event approaches,
when the number of repetitions is increasing indefinitely; or in other words, as the limit
of the ratio of the number of favourable events to the total number of trials.

The theory of probabilities treats especially of such observations whose events 
cannot be naturally or immediately expressed in numbers. But there is no compulsion in
this limitation. When an observation can result in different numerical values, then for
each of these events we may very well speak of its probability, imagining as the opposite
event all the other possible ones. In this way the theory of probabilities has served as
the constant foundation of the theory of observation as a whole. 

But, on the other hand, it is important to notice that the determination of the 
law of errors by symmetrical functions may also be employed in the non-numerical cases 
without the intervention of the notion of probability. For as we can always indicate the 
mutually complementary opposite events as the “fortunate” or “unfortunate” one, or as
“Yes” and “No”, we may also use the numbers 0 and 1 as such a formal indication. If
then we identify 1 with the favourable “Yes”-event, 0 with the unfavourable “No”, the
sums of the numbers got in a series of repetitions will give the frequency of affirmative
events. This relation, which has been used already in some of the foregoing examples, we
must here consider more explicitly.

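If repetitions of the same observation, which admits of only two alternatives, give
the result “Yes” = 1 m times, against n times “No” = 0, then the relative frequency
for the favourable event is m/(m+n). But if we employ the form of the symmetrical functions
for the same law of actual errors, then the sums of the powers are

$$s_0 = m+n, \quad s_1 = s_2 = \dots = s_r = m. \tag{121}$$

In order to determine the half-invariants by means of this, we solve the equations

$$\begin{aligned}
m &= (m+n)\mu_1 \\
m &= m\mu_1 + (m+n)\mu_2 \\
m &= m\mu_1 + 2m\mu_2 + (m+n)\mu_3 \\
m &= m\mu_1 + 3m\mu_2 + 3m\mu_3 + (m+n)\mu_4
\end{aligned}$$

and find then

$$\begin{aligned}
\mu_1 &= \frac{m}{m+n} \\
\mu_2 &= \frac{mn}{(m+n)^2} \\
\mu_3 &= \frac{mn(n-m)}{(m+n)^3} \\
\mu_4 &= \frac{mn(n^2-4mn+m^2)}{(m+n)^4}.
\end{aligned} \tag{122}$$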

Compare § 23, example 2, and § 24, example 3. 
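
The solution of these equations can be verified with exact fractions. The sketch below assumes the general relation between sums of powers and half-invariants, $s_{r+1} = \sum_k \binom{r}{k} s_{r-k}\,\mu_{k+1}$, of which the four equations above are the first instances for $s_1 = s_2 = \dots = m$; the function names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def half_invariants(power_sums):
    """Half-invariants mu_1, mu_2, ... from the sums of powers s_0, s_1, ..."""
    s = [Fraction(t) for t in power_sums]
    mu = []                                   # mu[k] holds mu_{k+1}
    for r in range(len(s) - 1):
        known = sum(comb(r, k) * s[r - k] * mu[k] for k in range(r))
        mu.append((s[r + 1] - known) / s[0])  # solve the r-th equation for mu_{r+1}
    return mu

def closed_form(m, n):
    """The expressions (122) for a Yes/No observation with m Yes and n No."""
    t = Fraction(m + n)
    return [m / t, m * n / t**2, m * n * (n - m) / t**3,
            m * n * (n * n - 4 * m * n + m * m) / t**4]

m, n = 3, 7
print(half_invariants([m + n, m, m, m, m]))
print(closed_form(m, n))
```

Interchanging m and n in the call leaves the second and fourth values unchanged and reverses the sign of the third, as the text goes on to remark.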

All the half-invariants are integral functions of the relative frequency, which is
itself equal to $\mu_1$. The relative frequency of the opposite result is $\frac{n}{m+n} = 1 - \mu_1$; by
interchanging m and n, none of the half-invariants of even degree are changed, and those
of odd degree (from $\mu_3$ upwards) only change their signs.

In order to represent the connection between the laws of presumptive errors, we
need only assume, in (122), that m and n increase indefinitely, while the probability of the
event becomes $p = \frac{m}{m+n}$, and the probability of the opposite event is represented by
$q = 1 - p = \frac{n}{m+n}$. The half-invariants are then:

$$\begin{aligned}
\lambda_1 &= p \\
\lambda_2 &= pq \\
\lambda_3 &= pq(q-p) \\
\lambda_4 &= pq(q^2-4pq+p^2).
\end{aligned} \tag{123}$$


Our mean values are therefore, respectively, the relative frequency and the probability itself. 



We must now first notice here that every half-invariant is its own fixed and
simple function of the probability (the frequency). When a result of observation can be
stated in the form of one single probability, properly so called, we have thereby given as
complete a determination of the law of errors as by the whole series of half-invariants.
In such cases it is simpler to employ the theory of probability instead of the symmetrical
functions and the method of the least squares.

The theory of probability thereby gets its province determined in a much more 
natural and suitable way than that employed in the beginning of this paragraph. 

But at the same time we see that the form of the half-invariants is not only the
general means which must be employed where the conditions for the use of the probability
are not fulfilled, but also that, within the theory of probability itself, we shall require,
particularly, the notion of the mean error.

Even where the probability can replace all the half-invariants, we shall require all 
the various sides of the notions which are distinctly expressed in the half-invariants. Now 
we have particularly to consider the probability as the definite mean value, now the point
is to elicit the definite degree of uncertainty which is implied in the probability, and 
which is particularly emphasised in the mean error. Otherwise, we should constantly be 
tempted to rely on the predictions of the theory of probability to an extent far beyond 
what is justly due to them. Finally, we shall see immediately that the laws of error of 
the probabilities are far from typical, but that they have rather a type of their own, which 
must sometimes be especially emphasised. 

All this we shall be able to do here, where we have the half-invariants in reserve 
as a means of representing the theory of probability. 

§ 66. In particular, we can now, though only in the form of the half-invariants, 
solve one of the principal problems of the theory of probability, and determine the law of 
presumptive errors for the frequency m of one of the events of a trial, which can have 
only two events and which is repeated N times, upon the supposition that the trial follows 
the law of the large numbers, and that the probability p for a single trial is known. 

The equations (123) give us already the corresponding law of error for each trial,
and as the total absolute frequency is the sum of the partial ones, we need only use the
equations (35) to find:

$$\begin{aligned}
\lambda_1(m) &= Np \\
\lambda_2(m) &= Npq = Np(1-p) \\
\lambda_3(m) &= Npq(q-p) = Np(1-p)(1-2p) \\
\lambda_4(m) &= Npq(q^2-4pq+p^2).
\end{aligned} \tag{124}$$





The ratio of the mean frequency to the number of trials is therefore the probability itself.
When p is small the mean error differs little from the square root $\sqrt{Np}$ of the mean
frequency; and if p is nearly = 1, the mean error of the opposite event is nearly equal to
$\sqrt{Nq}$. When the probability, p, is nearly equal to ½, the mean error will be about $\tfrac{1}{2}\sqrt{N}$.

The law of error is not strictly typical, although the rational function of the r-th
degree in $\lambda_r(m)$ vanishes for r different values of p between 0 and 1, the limits included,
so that the deviation from the typical form must, on the whole, be small. If, however, we
consider the relative magnitude of the higher half-invariants as compared with the powers
of the mean error

$$\frac{\lambda_3(m)}{(\lambda_2(m))^{3/2}} = \frac{q-p}{\sqrt{Npq}}
\quad\text{and}\quad
\frac{\lambda_4(m)}{(\lambda_2(m))^{2}} = \frac{q^2-4pq+p^2}{Npq}, \tag{125}$$

the occurrence of Npq in the denominators of the abridged fractions shows, not only that
great numbers of repetitions, here as always, cause an approximation to the typical form,
but also that, in contrast to this, the law of error in the cases of certainty and
impossibility, when q = 0 and p = 0, becomes skew and deviates from the typical in an infinitely
high degree, while at the same time the square of the mean errors becomes = 0. This
remarkable property is still traceable in the cases in which the probability is either very
small or very nearly equal to 1. In a hundred trials with the probability = 99½ per ct.
the mean error will be about $\sqrt{\tfrac{1}{2}} \approx 0.7$. Errors beyond the mean frequency 99½ cannot
exceed ½ and are therefore less than the mean error. The great diminishing errors must
therefore be more frequent than in typical cases, and frequencies of 97 or 96 will not be
rare in the case under consideration, though they must be fully counter-balanced by
numerous cases of 100 per ct. The law of error is consequently skew in a perceptible
degree. In applications of adjustment to problems of probability, it is, from this reason,
frequently necessary to reject extreme probabilities.
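
The case of a hundred trials with probability 99½ per cent can be worked out directly. The sketch below evaluates the half-invariants of the frequency for N repetitions, the ratio of the third half-invariant to the cube of the mean error, and the exact probabilities of a few frequencies from the binomial formula; the function names are illustrative assumptions.

```python
from math import comb, sqrt

def frequency_half_invariants(N, p):
    """lambda_1 .. lambda_4 of the frequency in N trials with probability p."""
    q = 1 - p
    return (N * p,
            N * p * q,
            N * p * q * (q - p),
            N * p * q * (q * q - 4 * p * q + p * p))

def binomial_probability(N, p, m):
    """Probability of exactly m favourable events in N trials."""
    return comb(N, m) * p**m * (1 - p)**(N - m)

lam1, lam2, lam3, lam4 = frequency_half_invariants(100, 0.995)
print("mean frequency:", lam1, " mean error:", sqrt(lam2))
print("lambda_3 / lambda_2^(3/2):", lam3 / lam2**1.5)
for m in (100, 99, 98, 97, 96):
    print(m, binomial_probability(100, 0.995, m))
```

The strongly negative ratio and the large probability of the frequency 100 illustrate the skewness described in the text.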


XV. THE FORMAL THEORY OF PROBABILITY.

§ 67. The formal theory of probability teaches us how to determine probabilities 
that depend upon other probabilities, which are supposed to be given. Of course, there 
are no mathematical rules specially applicable to computations that deal with probabilities, 
and there are many computations with probabilities which do not fall under the theory of 
probability, for instance, adjustments of probabilities. But in view of the direct application
of probabilities, not only to games, insurances, and statistics, but to all conditions of
life, it will be understood that special importance attaches to the marks which show 
that a computation will lead us to a probability as its result, as this implies in part or 
in the whole a determination of a law of errors. The formal theory of probabilities rests 
on two theorems, one concerning the addition of probabilities, the other concerning their 
multiplication. 

I. The theorem concerning the addition of probabilities can, as all probabilities
are positive numbers, be deduced from the usual definition of addition as a putting together:
if a sum of probabilities is to be a probability itself, we must be allowed to look upon
each of the probabilities that we are to add together as corresponding to its particular
events. These events must mutually exclude one another, but must at the same time have
a quality in common, to which, after the addition, our whole attention must be given. If
the sum is to be the correct probability of events with this quality, the same quality must
be found in no other event of the trial. An “either-or” is, therefore, the simple
grammatical mark of the addition of probabilities. The event E, whose probability is $p_1 + p_2$,
must occur, if either the result $E_1$ whose probability is $p_1$ or the quite different event $E_2$
whose probability is $p_2$ occurs, and not in any other case. If we require no other
resemblance between the events whose probabilities are added together, than that they belong
to the same trial, their sum must be the probability 1, certainty, because then all events
of the trial are favourable. If p be the probability for a certain event, q the probability
against the same, then we have p + q = 1, q = 1 - p. If n events of the same trial be
equally probable, the probability of each being = p, then the aggregate probability of
these events is np.

II. The theorem concerning the multiplication of probabilities can, as all probabilities
are proper fractions, be deduced from the definition of the multiplication of fractions,
according to which the product is the same proportional of the multiplicand as the
multiplier is of unity. Only as probabilities presuppose infinite numbers of trials, we shall
commence by proving the corresponding proposition for relative frequencies.

If, in $p = p_1 p_2$, $p_1$ is a relative frequency, it must relate to a trial $T_1$ which,
repeated $N$ times, has given favourable events in $Np_1$ cases; and if $p_2$, being also a relative
frequency, takes the place of multiplier, then the corresponding trial $T_2$, if repeated $Np_1$
times, must have given $(Np_1)p_2$ favourable events. Now in the multiplication $p = p_1 p_2$,
$p$ must be the relative frequency of the compound trials which out of the total number of
$N$ repetitions have given $Np_1p_2$ favourable events. The trials $T_1$ and $T_2$ must both have
succeeded as conditional for the final event. As the number $N$ can be taken as large as
we please, the same proposition must hold good for probabilities.





The probability $p = p_1 p_2$, as the product of the probabilities $p_1$ and $p_2$, relates to
the event of a compound trial, which is favourable only if both conditional trials, $T_1$ and
$T_2$, have given favourable events; first the trial $T_1$ must have had the event whose
probability is $p_1$, and then the other trial $T_2$ must have succeeded in the event whose
probability, on condition of success in $T_1$, is $p_2$. However indifferent the order of
the factors may be in the numerical computation, it is nevertheless, if a probability is
correctly to be found as the product of the probabilities of conditional events, necessary
to imagine the conditional trials arranged in a definite order. To prove this very important
proposition we shall suppose that both conditional trials are carried out in every case of
the compound trial. Let both $T_1$ and $T_2$ have succeeded in $a$ cases, while only $T_1$ has
succeeded in $b$ cases, only $T_2$ in $c$ cases, and neither in $d$ cases. Considering each of the
two trials without any regard to the other, we therefore get

$$p_1 = \frac{a+b}{a+b+c+d} \quad\text{and}\quad p_2 = \frac{a+c}{a+b+c+d}$$

as frequencies or probabilities of their favourable events. But in
the multiplication for computation of the compound probability, $p_1$ and $p_2$ are applicable
only as multiplicands; the correct result $p = \frac{a}{a+b+c+d}$ is found by $p = p_1 \cdot \frac{a}{a+b}$
or by $p = p_2 \cdot \frac{a}{a+c}$, according to the order in which the trials are executed, but not
as $p = p_1 p_2$, unless $a : b = c : d$. But this proportion expresses that the frequency or
probability of the trial $T_2$ is not affected by the event of the trial $T_1$. This proportionality
is the mark of freedom, if we consider the multiplication of probabilities as the determination
of the law of errors for a function of two observed values whose laws of errors are given.
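The argument with the four counts a, b, c, d can be checked numerically. The following is only a minimal sketch in Python; the function name `compound_probabilities` and the sample counts are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

def compound_probabilities(a, b, c, d):
    """a: both trials succeed, b: only T1, c: only T2, d: neither."""
    total = a + b + c + d
    p = Fraction(a, total)            # probability of the compound event
    p1 = Fraction(a + b, total)       # T1 taken without regard to T2
    p2 = Fraction(a + c, total)       # T2 taken without regard to T1
    p2_after_t1 = Fraction(a, a + b)  # T2 on condition of success in T1
    p1_after_t2 = Fraction(a, a + c)  # T1 on condition of success in T2
    return p, p1, p2, p2_after_t1, p1_after_t2

# Hypothetical counts in which the two trials are not free of each other.
p, p1, p2, p2_t1, p1_t2 = compound_probabilities(30, 10, 20, 40)
print(p == p1 * p2_t1, p == p2 * p1_t2, p == p1 * p2)

# Hypothetical counts with a : b = c : d, where the plain product is correct.
p, p1, p2, _, _ = compound_probabilities(6, 4, 9, 6)
print(p == p1 * p2)
```

In the first case only the products taken in a definite order reproduce the compound probability; in the second, where the proportion holds, the unconditional product does as well.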

Since impossibility is indicated by probability $= 0$, we see that the compound
trial is impossible, if there is any of the conditional trials that cannot possibly succeed,
i. e. if $p_1 = 0$ or $p_2 = 0$ in $p = p_1 p_2$. The condition of certainty (probability $= 1$) in
a compound trial is certainty for the favourable events of all conditional trials; for as $p_1$
and $p_2$ as probabilities must be proper fractions, $p_1 p_2 = p = 1$ will be possible only when
both $p_1 = 1$ and $p_2 = 1$.

Example 1. When the favourable events of all the conditional trials, $n$ in
number, have the same probability $p$, the compound event, which depends on the success
of all these, has the probability $p^n$. If by every single drawing there is the probability of
$\frac12$ for “red” and $\frac12$ for “black”, the probability of 10 drawings all giving red will be $\frac{1}{1024}$.

Example 2. Suppose a pack of 52 cards to be so well shuffled that the probabilities
of red and black may constantly be proportional to the remainder in the stock, then
the probability of the 10 uppermost cards being red will be

$$\frac{26}{52}\cdot\frac{25}{51}\cdot\frac{24}{50}\cdot\frac{23}{49}\cdot\frac{22}{48}\cdot\frac{21}{47}\cdot\frac{20}{46}\cdot\frac{19}{45}\cdot\frac{18}{44}\cdot\frac{17}{43} = \frac{26!\,42!}{16!\,52!} = \frac{\beta_{26}(10)}{\beta_{52}(10)} = \frac{19}{56588} \approx \frac{1}{2978},$$

the $\beta$ being binomial functions.
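Both examples are easy to verify with exact fractions. The sketch below is an illustration only; the function names are assumptions made for this check.

```python
from fractions import Fraction
from math import comb

def all_red_constant_chance(draws, p_red=Fraction(1, 2)):
    """Every drawing has the same probability of red (Example 1)."""
    return p_red ** draws

def all_red_from_pack(draws, red=26, pack=52):
    """Cards are not returned, so each factor depends on the remaining stock (Example 2)."""
    prob = Fraction(1)
    for k in range(draws):
        prob *= Fraction(red - k, pack - k)
    return prob

print(all_red_constant_chance(10))                 # 1/1024
print(all_red_from_pack(10))                       # 19/56588
print(Fraction(comb(26, 10), comb(52, 10)))        # same value, as a ratio of binomial functions
```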



Example 3. Compute the probability that a man whose age is $a$ will be still alive
after $n$ years, and that he will die in one of the succeeding $m$ years.

If we suppose that $q_i$ is the probability that a man whose age is $i$ will die before
his next birthday, the probability that the man whose age is $a$ will be alive at the end
of $n$ years will be

$$P_n = (1-q_a)(1-q_{a+1})\dots(1-q_{a+n-1}).$$

The probability $Q_m$ of his then dying in either one or the other of the succeeding
$m$ years will be

$$Q_m = q_{a+n} + (1-q_{a+n})\{q_{a+n+1} + (1-q_{a+n+1})[q_{a+n+2} + \dots + (1-q_{a+n+m-2})\,q_{a+n+m-1}]\},$$

or

$$1 - Q_m = (1-q_{a+n})(1-q_{a+n+1})\dots(1-q_{a+n+m-1}).$$

The required probability of death after $n$ years, but before the elapse of $n+m$ years, is
consequently $P_n Q_m = P_n - P_{n+m}$.

The most convenient form for statements of mortality is not, as we here supposed,
a table of the probabilities $q_t$ for all integral ages $t$, but of the absolute frequencies $l_t$ of
the men from a large (properly infinitely large) population who will reach the age of $t$.
After this $q_t = 1 - \frac{l_{t+1}}{l_t}$ will only be a special case of the general
answer:

$$P_n Q_m = \frac{l_{a+n} - l_{a+n+m}}{l_a}.$$
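The agreement between the product of yearly probabilities and the direct answer from a table of survivors can be checked as follows. This is a sketch under stated assumptions: the survivor numbers in `survivors` are invented for illustration and are not a real mortality table, and the function names are hypothetical.

```python
from fractions import Fraction

# Hypothetical numbers of survivors l_t at ages 60..66 (illustrative only).
survivors = {60: 1000, 61: 980, 62: 955, 63: 925, 64: 890, 65: 850, 66: 805}

def q(t):
    """Probability that a man aged t dies before his next birthday."""
    return 1 - Fraction(survivors[t + 1], survivors[t])

def survive_then_die(a, n, m):
    """P_n * Q_m built from the yearly probabilities q_t."""
    p_n = Fraction(1)
    for t in range(a, a + n):
        p_n *= 1 - q(t)
    alive_through_m = Fraction(1)
    for t in range(a + n, a + n + m):
        alive_through_m *= 1 - q(t)
    return p_n * (1 - alive_through_m)

def from_survivor_table(a, n, m):
    """The same probability read directly from the table of survivors."""
    return Fraction(survivors[a + n] - survivors[a + n + m], survivors[a])

print(survive_then_die(60, 2, 3), from_survivor_table(60, 2, 3))
```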

Example 4. We imagine a game of cards arranged in such a way that each 
player, in a certain order, gets two cards of the well-shuffled pack, and wins or loses 
according as the sum of the points on his two cards is eleven or not. For 5 players we 
use, for instance, only the cards 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10 of the same colour. 

What then is the probability of $k$ players (named beforehand) getting 11 and not
any of the $5-k$ others?

Secondly, what probability, $r_k$, is there that the $k$th player in succession will be
the first who gets 11?

Lastly, what is the probability, $q$, that none of the players will get 11?

It will be found perhaps that it is not quite easy to compute these probabilities
directly. In such cases it is a good plan to reconnoitre the problem by first bringing out
such results as present themselves quite easily and simply, without considering whether
they are just those we require; in this case, for instance, we take the probabilities that
each of the first $i$ players will get 11.

We then attack the problem more seriously, and examine if there are not any
simple functions of the probabilities we have found, which may be interpreted as
probabilities of the same or similar sort as those inquired after.
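One way to carry out the plan just described is sketched below. It is not presented as the author's solution: the easy probabilities that the first i players all get 11 are counted by listing every deal, and the required probabilities are then formed from them by alternating sums with binomial coefficients (inclusion and exclusion), which is an assumption about which "simple functions" are meant. A direct count is used as a check. All function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

CARDS = tuple(range(1, 11))
PLAYERS = 5

def deal_outcomes(cards=CARDS, players=PLAYERS):
    """For every equally likely deal, yield a tuple of booleans: did each player get 11?"""
    if players == 0:
        yield ()
        return
    for hand in combinations(cards, 2):
        rest = tuple(c for c in cards if c not in hand)
        for tail in deal_outcomes(rest, players - 1):
            yield (sum(hand) == 11,) + tail

outcomes = list(deal_outcomes())
total = len(outcomes)

# The easy results: probability that each of the first i players gets 11.
p = [Fraction(sum(all(o[:i]) for o in outcomes), total) for i in range(PLAYERS + 1)]

def named_k_only(k):
    """k named players get 11 and none of the 5 - k others, from the p_i alone."""
    return sum((-1) ** j * comb(PLAYERS - k, j) * p[k + j] for j in range(PLAYERS - k + 1))

def first_winner(k):
    """r_k: the k-th player in succession is the first who gets 11, from the p_i alone."""
    return sum((-1) ** j * comb(k - 1, j) * p[j + 1] for j in range(k))

def count_named_k_only(k):
    """Direct count, used only as a check."""
    return Fraction(sum(all(o[:k]) and not any(o[k:]) for o in outcomes), total)

print(p)
print(named_k_only(0), [first_winner(k) for k in range(1, PLAYERS + 1)])
```

Here `named_k_only(0)` is the probability q that nobody gets 11.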





§ 68. Repetitions of the same trial occur very frequently in problems solvable by
the theory of probabilities, and should always be treated by means of a very simple and
important law, the polynomial formula.

Let us suppose that the various events of the single trial may be indicated by colours,
and that, in the single trial, the probability of white is $w$, of black $b$, and of red $r$.
The probability that we shall get in $x+y+z$ trials $x$ white, $y$ black, and $z$ red
results, in a given order, is then

$$w^x b^y r^z.$$

The number of the events of this kind that differ only in order, is the trinomial
coefficient

$$\tau(x,y,z) = \frac{(x+y+z)!}{x!\,y!\,z!},$$

which is the coefficient of the term $w^x b^y r^z$ in the development of $(w+b+r)^{x+y+z}$.
And this same term

$$\tau(x,y,z)\,w^x b^y r^z \qquad (126)$$

is the required probability of getting white $x$ times, black $y$ times, and red $z$ times by
$(x+y+z)$ repetitions.

When the probabilities of all possible single results are known and employed, so
that $w+b+r+\dots = 1$, and when the number of repetitions is $n$, we must consequently
imagine $(w+b+r+\dots)^n$ developed by the polynomial theorem, and the single terms of
the development will then give us the probabilities of the different possible events of the
repetitions without regard to the order of succession.

Example 1. If the question is of the probability of getting, in 10 trials in which
there are the three possible events of white, black, and red, even numbers $x$, $y$, and $z$ of
each colour, and if the probabilities of the single events are $w$, $b$, and $r$, respectively, then
we must retain the terms of $(w+b+r)^{10}$ which have even indices, and we thus find:

$$w^{10} + 45w^8(b^2+r^2) + 210w^6(b^4+6b^2r^2+r^4) + 210w^4(b^6+15b^4r^2+15b^2r^4+r^6)$$
$$+\, 45w^2(b^8+28b^6r^2+70b^4r^4+28b^2r^6+r^8) + b^{10}+45b^8r^2+210b^6r^4+210b^4r^6+45b^2r^8+r^{10}$$
$$= \tfrac14\{(w+b+r)^{10} + (-w+b+r)^{10} + (w-b+r)^{10} + (w+b-r)^{10}\}$$
$$= \tfrac14\{1 + (1-2w)^{10} + (1-2b)^{10} + (1-2r)^{10}\}.$$

The probability, consequently, is always greater than $\frac14$, but only a little greater, unless
the probability of getting some of the events in a single trial is very small.
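The polynomial formula and the closed expression for even counts can be compared term by term. In this sketch the single-trial probabilities 1/2, 1/3, 1/6 are hypothetical values chosen for illustration, and the function names are assumptions.

```python
from fractions import Fraction
from math import factorial

def trinomial_term(x, y, z, w, b, r):
    """Probability of x white, y black, z red in x + y + z trials, order disregarded."""
    coeff = factorial(x + y + z) // (factorial(x) * factorial(y) * factorial(z))
    return coeff * w**x * b**y * r**z

def all_counts_even(n, w, b, r):
    """Sum of the terms of (w + b + r)^n with even indices (n itself even)."""
    return sum(trinomial_term(x, y, n - x - y, w, b, r)
               for x in range(0, n + 1, 2)
               for y in range(0, n - x + 1, 2))

def closed_form(n, w, b, r):
    return Fraction(1, 4) * (1 + (1 - 2*w)**n + (1 - 2*b)**n + (1 - 2*r)**n)

w, b, r = Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)   # hypothetical probabilities
print(all_counts_even(10, w, b, r), closed_form(10, w, b, r))
```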

Example 2. Peter and Paul play at heads-or-tails (i. e. probability $\frac12$ for and
against). But Peter throws with 3 coins, Paul only with 2, and the one wins who gets
the greatest number of “heads”. If both get the same number of heads they throw again,
as often as may be necessary. What is the probability that Peter will win?

If we write for Peter’s probability for and against throwing heads $p_1 = \frac12$ and
$q_1 = \frac12$, for Paul’s $p_2 = \frac12$ and $q_2 = \frac12$, then we should develop $(p_1+q_1)^3 (p_2+q_2)^2$;
the terms in which the index of $p_1$ is greater than that of $p_2$ are in favour of Peter;
those in which the indices are equal, give a drawn game; and those in which the index
of $p_2$ is greater than that of $p_1$, are in favour of Paul. For the single game there is the
probability

for Peter of $\frac{16}{32}$,

for a drawn game of $\frac{10}{32}$,

for Paul of $\frac{6}{32}$.

As the probabilities are distributed in the same way, when they play the games over
again, we need not consider the possibilities of drawn games at all, and we find $\frac{16}{22} = \frac{8}{11}$ as
Peter’s final probability.
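The three probabilities for the single game, and Peter's final probability, follow from pairing the two binomial laws. A short sketch (function and variable names are illustrative assumptions):

```python
from fractions import Fraction
from math import comb

def heads_distribution(coins):
    """Probabilities of 0, 1, ..., coins heads with fair coins."""
    return [Fraction(comb(coins, k), 2 ** coins) for k in range(coins + 1)]

peter = heads_distribution(3)
paul = heads_distribution(2)

pairs = [(i, j, pi * pj) for i, pi in enumerate(peter) for j, pj in enumerate(paul)]
peter_wins = sum(pr for i, j, pr in pairs if i > j)
drawn = sum(pr for i, j, pr in pairs if i == j)
paul_wins = sum(pr for i, j, pr in pairs if i < j)

# Drawn games are simply played again, so only the decided games matter.
peter_final = peter_wins / (peter_wins + paul_wins)
print(peter_wins, drawn, paul_wins, peter_final)
```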

Example 3. A game which is won once out of four times, is repeated 10 times.
What is the probability of winning at most 2 of these?

$$\frac{551124}{1048576}.$$
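The fraction is the sum of the first three terms of the binomial development, which a few lines of Python confirm (the function name is an illustrative assumption):

```python
from fractions import Fraction
from math import comb

def at_most(wins, games, p):
    """Probability of winning at most `wins` out of `games` independent games."""
    return sum(comb(games, k) * p**k * (1 - p)**(games - k) for k in range(wins + 1))

result = at_most(2, 10, Fraction(1, 4))
print(result, float(result))
```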

§ 69. It often occurs that we inquire in a general way concerning a probability, 
which is a function of one or more numbers. Often it is also easier to transform a special 
problem into such a one of a more general character, where the unknown is a whole table 
$p_1, p_2, p_3, \dots p_n$ of probabilities, the suffixes being the arguments of the table. And
then we must generally work with implicit equations, $f(p_1, \dots p_n) = 0$, particularly such
as hold good for an arbitrary value of $n$, i. e. with difference-equations. Integration of
finite difference-equations is indeed of so great importance in the art of solving problems 
of the theory of probabilities, that we can almost understand that Laplace has treated 
this method almost as the one to be used in all cases, in fact as the scientific quintessence 
of the theory of probabilities. 

Since finite difference-equations like differential equations cannot as a rule be inte¬ 
grated by known functions, we can in an elementary treatise deal only with the simplest 
cases, especially such as can be solved by exponential functions, namely the linear difference- 
equations with constant coefficients. As to these, it is only necessary to mention here 
that, when

$$c_m p_{n+m} + \dots + c_1 p_{n+1} + c_0 p_n = 0 \quad (n \text{ being arbitrary}),$$

the solution is given by

$$p_n = k_1 r_1^n + \dots + k_m r_m^n,$$

where $r_1, \dots r_m$ are the roots in the equation

$$c_m r^m + \dots + c_1 r + c_0 = 0,$$

while $k_1, \dots k_m$ are integration-constants whenever the corresponding roots occur singly;
but rational integral functions with arbitrary constants, and of the degree $i-1$, if the
corresponding root occurs $i$ times.


I shall mention one other means, however, not only because it can really lead
to the integration of many of the difference-equations which the theory of probabilities
leads to, particularly those in which the exponential functions occur in connection with
binomial functions and factorials, but also because it has played an important part in the
conception of this book.

The late Professor L. Oppermann, in April 1871, communicated to me a method
of transformation, which I shall here state with an unessential alteration.

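A finite or infinite series of numbers $u_0, u_1, u_2, \dots u_n$
can univocally be expressed by another:

$$\begin{aligned}
w_0 &= u_0 + u_1 + u_2 + u_3 + u_4 + \dots\\
w_1 &= -u_1 - 2u_2 - 3u_3 - 4u_4 - \dots\\
w_2 &= u_2 + 3u_3 + 6u_4 + \dots\\
w_3 &= -u_3 - 4u_4 - \dots\\
w_4 &= u_4 + \dots\\
w_i &= (-1)^i \sum \beta_p(i)\,u_p
\end{aligned} \qquad (128)$$

where the sum $\sum$ may be taken from $p = -\infty$ to $p = +\infty$, provided that $u_p = 0$ when $p > n$.
In order, vice versa, to compute the $u$'s by means of the $w$'s, we have equations of just
the same form:

$$\begin{aligned}
u_0 &= w_0 + w_1 + w_2 + w_3 + w_4 + \dots\\
u_1 &= -w_1 - 2w_2 - 3w_3 - 4w_4 - \dots\\
u_2 &= w_2 + 3w_3 + 6w_4 + \dots\\
u_3 &= -w_3 - 4w_4 - \dots\\
u_4 &= w_4 + \dots
\end{aligned} \qquad (129)$$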


Here, as in (17) and (18), the general dependency between the $u_i$ and $w_i$ can be
expressed in a single equation, by means of an independent variable $z$. From (129) we
get identically

$$u_0 + u_1 e^{z} + u_2 e^{2z} + \dots = w_0 + (1-e^{z})\,w_1 + (1-e^{z})^2 w_2 + \dots;$$

if we here put $e^{z} = 1 - e^{y}$, then $1 - e^{z} = e^{y}$ will reduce (128) to an equation of
the same form.







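If $u_i$ is the frequency or probability of $i$ taken as an observed value, then also

$$u_0 + u_1 e^{z} + u_2 e^{2z} + \dots = s_0 + \frac{s_1}{1!}z + \frac{s_2}{2!}z^2 + \dots$$
$$= s_0\, e^{\frac{\mu_1}{1!}z + \frac{\mu_2}{2!}z^2 + \dots} = w_0 + (1-e^{z})\,w_1 + (1-e^{z})^2 w_2 + \dots$$

illustrate the relations of the values in Oppermann’s transformation to the half-invariants
and sums of powers. In particular we have

$$\mu_1 = -\frac{w_1}{w_0},$$
$$\mu_2 = \frac{2w_2}{w_0} - \frac{w_1(w_0+w_1)}{w_0^2},$$
$$\mu_3 = -\frac{6w_3}{w_0} + \frac{6w_2(w_0+w_1)}{w_0^2} - \frac{w_1(w_0+w_1)(w_0+2w_1)}{w_0^3}.$$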
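Oppermann's transformation and its use for the mean and the square of the mean deviation can be tried on a small series. The following is a sketch only: the function name `oppermann` and the sample series (heads in three tosses of a fair coin) are assumptions for illustration. The relations used for the mean and the mean deviation follow from the sums of powers, since the sum of i·u_i equals -w_1 and the sum of i²·u_i equals 2w_2 - w_1.

```python
from fractions import Fraction
from math import comb

def oppermann(u):
    """w_i = (-1)^i * sum over p of C(p, i) * u_p, for a finite series u_0 .. u_n."""
    size = len(u)
    return [(-1) ** i * sum(comb(p, i) * u[p] for p in range(i, size))
            for i in range(size)]

# Hypothetical law of errors: number of heads in three tosses of a fair coin.
u = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]
w = oppermann(u)

mean = -w[1] / w[0]
mean_deviation_squared = (2 * w[2] - w[1]) / w[0] - (w[1] / w[0]) ** 2

print(w)
print(oppermann(w) == u)          # the same equations lead back from the w's to the u's
print(mean, mean_deviation_squared)
```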

If now $u_0, u_1, \dots u_n$ are a series of probabilities or other quantities which depend
on their suffix according to a fixed law, and if we know this law only through a difference-
equation, then Oppermann's transformation of course leads only to a difference-equation
for $w_0, w_1, \dots w_n$ as function of their suffix. But it turns out that, in problems of
probabilities, this equation pretty often is easier to deal with than the original one (for
instance the more difficult ones in Laplace's collection of problems). If we can look upon
a probability $u_i$ as the functional law of errors for $i$ as the observed value, then $w$ expresses
the same law of errors by symmetrical functions, and frequently we want nothing more.
If we have to reverse the process to find $u_i$ itself, the series are pretty simple if $w$ is
simple; but they are often less favourable for numerical computation, as they frequently
give the unknown as a difference between much larger quantities. There exists a means
of remedying this, but it would carry us too far to enter into a closer examination of the
question here.

Example 1. I throw a die, and go on throwing till I either win by getting “one”
twice, or lose by throwing “two” or “three”. If the game is to be over at latest by the
$n$th throw, what is my probability of winning? If the number of throws is unlimited,
what is the probability of another “one” appearing before any “two” or “three”?

Four results are to be distinguished from one another. At any throw, say the $i$th,
the game can in general be won, lost, half won (by only one “one”), or drawn. Let the
probability of the $i$th throw resulting in a win be $p_i$, of the same resulting in a loss be
$q_i$, in half win $s_i$, and in a drawn game be $r_i$, then $p_1 = 0$, $q_1 = \frac13$, $s_1 = \frac16$, and $r_1 = \frac12$.
Thus the probability of a second throw is $s_1 + r_1 = \frac23$ and, generally, the probability of an $i$th throw
$s_{i-1} + r_{i-1}$. It is easy to express $p_i$, $q_i$, $s_i$, and $r_i$ in terms of $s_{i-1}$ and $r_{i-1}$,

$$p_i = \tfrac16 s_{i-1}, \quad q_i = \tfrac13\,(s_{i-1}+r_{i-1}), \quad s_i = \tfrac12 s_{i-1} + \tfrac16 r_{i-1}, \quad r_i = \tfrac12 r_{i-1},$$

and also $p_{i-1}$, $q_{i-1}$, $s_{i-1}$, $r_{i-1}$ in terms of $s_{i-2}$ and $r_{i-2}$; by elimination then the
difference-equations can be found.

When we replace $p$ or $q$ or $s$ or $r$ by $x$ the difference-equation can be written in
the common form

$$x_i - x_{i-1} + \tfrac14 x_{i-2} = 0,$$

which is integrated as

$$x_i = (a+bi)\,2^{-i};$$

for $r$ we have the simpler form

$$r_i = \tfrac12 r_{i-1}.$$

When, by the probabilities of the first throws, we have determined the constants,
we get

$$p_i = \frac{i-1}{9}\,2^{-i}, \quad q_i = \frac{2i+4}{9}\,2^{-i}, \quad s_i = \frac{i}{3}\,2^{-i}, \quad\text{and}\quad r_i = 2^{-i}.$$

We then have the formulae $P_n = p_1 + \dots + p_n$ and $Q_n = q_1 + \dots + q_n$ for the
probabilities of making the winning or losing throw, and we get

$$\frac{P_n}{P_n+Q_n} = \frac13\cdot\frac{2^n-1-n}{3(2^n-1)-n} \quad\text{and}\quad \frac{P_\infty}{P_\infty+Q_\infty} = \frac19.$$
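The game can be followed throw by throw with exact fractions, which gives a check on any closed expression for the probabilities of having won or lost within n throws. This is a minimal sketch; the function name and the bookkeeping by two "still playing" states are assumptions made for the illustration.

```python
from fractions import Fraction

def game_probabilities(n):
    """Return (P_n, Q_n): the chances of having won, or lost, within n throws."""
    half_won = Fraction(0)   # still playing, one "one" already thrown
    drawn = Fraction(1)      # still playing, no "one" yet
    won = lost = Fraction(0)
    for _ in range(n):
        won += half_won * Fraction(1, 6)             # a second "one"
        lost += (half_won + drawn) * Fraction(1, 3)  # "two" or "three"
        half_won, drawn = (half_won * Fraction(1, 2) + drawn * Fraction(1, 6),
                           drawn * Fraction(1, 2))
    return won, lost

for n in (2, 5, 10):
    P, Q = game_probabilities(n)
    print(n, P, Q, P / (P + Q))

P, Q = game_probabilities(200)
print(float(P))   # with unlimited throws the chance of winning tends to 1/9
```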

Example 2. In a game the probability of winning is $\omega$. The same game is
repeated a great many times. If it then happens at least once in this series that $m$
successive games are won, you get a prize. What is the probability of this? In a
game of dice, where $\omega = \frac16$, what is the probability of getting a series of 5 “sixes” in
10000 throws?

It will be simplest to find the probability, $q_r$, that the prize will not be
got in the first $r$ repetitions. The difference-equation for this is

$$q_{r+m+1} - q_{r+m} + (1-\omega)\,\omega^m q_r = 0 \qquad (a)$$

or

$$q_{r+m} - (1-\omega)\{q_{r+m-1} + \omega q_{r+m-2} + \dots + \omega^{m-2} q_{r+1} + \omega^{m-1} q_r\} = 0, \qquad (b)$$

where (b) is the first integral of (a). (As well as (a) we can directly demonstrate (b).
How?) Hence

$$q_r = c_1\rho_1^r + \dots + c_m\rho_m^r,$$

where $c_1, \dots c_m$ are constants, which, as well as $c_0 = 0$ (the constant belonging to the
root $\omega$ of (c)), must be determined by means of
$q_0 = q_1 = \dots = q_{m-1} = 1$, $q_m = 1 - \omega^m$; $\rho_1, \dots \rho_m$ are the roots of an irreducible
equation of the $m$th degree, which is got from

$$\rho^{m+1} - \rho^m = \omega^{m+1} - \omega^m \qquad (c)$$

by dividing out $\rho - \omega$. The largest of these roots (for small $\omega$'s or large $m$'s) will be
only a little less than 1; a small negative root occurs when $m$ is even; the others are
always imaginary, and they are also small.

In the actual computation it is highly desirable to avoid the complete solution of (c). 
This can be done, and this problem will illustrate a most important artifice. We may 
use the difference-equation to compute a single value of the unknown function by means 
of those which are known to us from the conditions of the problem, and then successive 
values of the unknown function by means of those already obtained; here, for instance, 
(b) enables us to get $q_{m+1}$ in terms of $q_1, \dots q_m$. Then we get $q_{m+2}$ either by again
applying (b) to $q_2, \dots q_{m+1}$ or by applying (a) to $q_{m+1}$ and $q_1$ (or best in both ways for the
sake of the check), etc.

It is evident that the table of the numerical values of the function which we can 
form in this way, cannot easily become of any great extent or give us exact information 
as to the form of the function, But we are able to interpolate, and, when the general 
form of the function is known (as here), we may be justified in using extrapolation also. 
In our example we need only continue the computations above described until the
term in $q_r = c_1\rho_1^r + \dots + c_m\rho_m^r$ corresponding to the greatest root dominates the others to
such a degree that the first difference of $\log q_r$ becomes constant, and the computation of
$q_r$ for higher indices can then be made as by a simple geometrical progression. In the
numerical case $q_r = 1.0004078 \times (0.9998928)^r$; $1 - q_{10000} = 0.6577$.
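The step-by-step use of the difference-equation is easy to mechanise. The sketch below assumes the equation in the form q_{r+m+1} = q_{r+m} - (1 - ω)·ω^m·q_r with starting values q_0 = ... = q_{m-1} = 1 and q_m = 1 - ω^m, and uses ordinary floating-point numbers; the function name is illustrative. For the dice case it yields a probability close to the figure quoted in the text, and the ratio of successive values settles at the dominant root.

```python
def no_run_probabilities(r_max, m, omega):
    """q_0 .. q_r_max: probability of no run of m successive wins in r games."""
    q = [1.0] * m + [1.0 - omega ** m]            # q_0 .. q_m
    step = (1.0 - omega) * omega ** m
    for k in range(m + 1, r_max + 1):
        q.append(q[k - 1] - step * q[k - m - 1])
    return q[: r_max + 1]

q = no_run_probabilities(10000, 5, 1.0 / 6.0)
print(1.0 - q[10000])          # probability of at least one run of five "sixes"
print(q[10000] / q[9999])      # ratio of successive values, i.e. the greatest root
```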

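Example 3. A bag contains $n$ balls, $a$ white and $n-a$ black ones. A ball is
drawn out of the bag and a black ball then placed in it, and this process is repeated $y$
times. After the $y$th operation the white and black balls in the bag are counted. Find
the probability $u_x(y)$ that the numbers of white balls will then be $x$ and the black
ones $n-x$.

We have

$$u_x(y) = \frac{x+1}{n}\,u_{x+1}(y-1) + \frac{n-x}{n}\,u_x(y-1)$$

and

$$u_x(0) = 0, \text{ except } u_a(0) = 1.$$

By Oppermann’s transformation we find

$$w_z(y) = (-1)^z \sum \beta_x(z)\left\{\frac{x+1}{n}\,u_{x+1}(y-1) + \frac{n-x}{n}\,u_x(y-1)\right\},$$

$\sum$ taken from $x = -\infty$ to $x = +\infty$, or

$$w_z(y) = (-1)^z \sum \left\{\frac{x+1-z}{n}\,\beta_{x+1}(z)\,u_{x+1}(y-1) + \frac{n-x}{n}\,\beta_x(z)\,u_x(y-1)\right\}.$$

The limits of $x$ under $\sum$ being infinite, $x+1$ can be replaced by $x$, consequently

$$w_z(y) = \frac{n-z}{n}\,w_z(y-1).$$

This difference-equation, in which $y$ is the variable, may easily be integrated. As we have, further,

$$w_z(0) = (-1)^z \beta_a(z),$$

we get

$$w_z(y) = (-1)^z \beta_a(z)\left(\frac{n-z}{n}\right)^y.$$

By Oppermann's inverse transformation we find now:

$$u_x(y) = (-1)^x \sum \beta_z(x)\cdot(-1)^z \beta_a(z)\cdot\left(\frac{n-z}{n}\right)^y,$$

$\sum$ taken from $z = -\infty$ to $z = +\infty$. This expression

$$u_x(y) = \beta_a(x) \sum (-1)^{z-x}\,\beta_{a-x}(z-x)\left(\frac{n-z}{n}\right)^y$$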
has the above mentioned practical short-comings, which are sensible particularly if $n$, $a-x$,
or $y$ are large numbers; in these cases an artifice like that used by Laplace (problem 17)
becomes necessary. But our exact solution has a simple interpretation. The sum that
multiplies $\beta_a(x)$ in $u_x(y)$ is the $(a-x)$th difference of the function $\left(\frac{t}{n}\right)^y$, and is found by a table
of the values $\left(\frac{n-a}{n}\right)^y, \dots, \left(\frac{n-x}{n}\right)^y$ as the final difference formed by
all these consecutive values. We learn from this interpretation that it is possible, if not
easy, to solve this problem without the integration of any difference-equation, in a way
analogous to that used in § 67, example 4.

If we make use of $w_z(y)$ to give us the half-invariants for the same law
of errors as is expressed by $u_x(y)$, then we find for the mean value of $x$ after $y$ drawings

$$a\left(\frac{n-1}{n}\right)^y$$

and for the square of the mean error

$$a(a-1)\left(\frac{n-2}{n}\right)^y + a\left(\frac{n-1}{n}\right)^y - a^2\left(\frac{n-1}{n}\right)^{2y}.$$
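The bag problem can be checked by running the difference-equation for the u's directly and comparing with an alternating binomial sum of the kind the inverse transformation produces. This is a sketch under assumptions: the closed expression coded in `closed_form` is the one obtained by inverting w_z(y) = (-1)^z C(a, z) ((n - z)/n)^y, the numbers n = 10, a = 4, y = 6 are hypothetical, and all names are illustrative.

```python
from fractions import Fraction
from math import comb

def urn_distribution(n, a, y):
    """u_x(y) for x = 0 .. a, by stepping the difference-equation y times."""
    u = [Fraction(0)] * (a + 2)
    u[a] = Fraction(1)
    for _ in range(y):
        u = [Fraction(x + 1, n) * u[x + 1] + Fraction(n - x, n) * u[x]
             for x in range(a + 1)] + [Fraction(0)]
    return u[: a + 1]

def closed_form(n, a, y, x):
    """C(a, x) times the (a - x)-th difference of (t/n)^y taken from t = n - a."""
    return comb(a, x) * sum((-1) ** (z - x) * comb(a - x, z - x) * Fraction(n - z, n) ** y
                            for z in range(x, a + 1))

n, a, y = 10, 4, 6                      # hypothetical numbers for illustration
u = urn_distribution(n, a, y)
mean = sum(x * ux for x, ux in enumerate(u))
mean_error_squared = sum(x * x * ux for x, ux in enumerate(u)) - mean ** 2
print(u)
print(mean, mean_error_squared)
```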


XVI. THE DETERMINATION OF PROBABILITIES A PRIORI
AND A POSTERIORI.

§ 70. The computations of probabilities with which we have been dealing in the
foregoing chapters have this point in common that we always assume one or several
probabilities to be given, and then deduce from them the required ones. If now we ask, how
we obtain those “given” probabilities, it is evident that other means are necessary than
those which we have hitherto been able to mention, and provisionally it must be clear
that both theory and experience must cooperate in these original determinations of
probabilities. Without experience it is impossible to insure agreement with reality, and without
theory in these as well as in other determinations we cannot get any firmness or exactness.
In determining probabilities, however, there is special reason to distinguish between two
methods, one of which, the a priori method, seems at first sight to be purely theoretical,
while the other, the a posteriori method, is as purely empirical.

§ 71. The a priori determination of probabilities is based on estimate of equality,
inequality, or ratio of the probabilities of the several events, and in this process we always
assume the operative causes, or at any rate their mode of operation, to be more or less
known.

On the one hand we have the typical cases in which we know nothing else with 
respect to the events but that each of them is possible, and in the absence of any reason 
for preferring any one of them to any other, we estimate them to be equally probable 
though certainly with the utmost uncertainty. For instance: What is the probability of 
seeing, in the course of time, the back of the moon? Shall we say ^ or ^? 

On the other hand we have the cases, equally typical but far more important,
in which, by virtue of a good theory, we know so much of the causes or combinations of
causes at work that, for each of those which will produce one event, we can point out
another (or $n$ others) which will produce the opposite event, and which according to the
theory must occur as frequently. In this case we must estimate the probability of the
result at $\frac12$ and $\frac{1}{n+1}$ respectively, and if the conditions stated be strictly fulfilled, such a
determination of probability will be exact.

But even if such a theory is not absolutely unimpeachable, we can often in this
way obtain probabilities, which are so nearly exact and have such infinitely small mean
errors, that we may very well make use of them, and compute from them values which
may be used as our theoretically given probabilities. We are not more strict in other
kinds of computations. In astronomical adjustment, for instance, it is almost an established
practice to consider all times of observation as theoretically given. Their real errors,
however, will often give occasion to sensible bonds between the observed co-ordinates; but the
fact is that it would require great labour to avoid the drawback.

Such an a priori determination of probabilities is particularly applicable in games.
For it is essential to the idea of a game that the rules must be laid down in such a
way that, on the one hand they exclude all computation beforehand of the result in a
particular case, while on the other hand they make a pretty exact computation of the
probabilities possible. The procedure employed in a game, e.g. throwing of dice or shuffling
of cards, ought therefore to exclude all circumstances that might permit the players to set
causes in train, which could bring about or further a certain event (corriger la fortune).
But also those circumstances ought to be eliminated, which not only by their incalculability
make a judgment of the probabilities very insecure, but, above all, make it depend on the
theoretical insight of the parties. Otherwise the game will cease to be a fair game and
will become a struggle. The so-called stock-jobbing is rather a war than a game.

When the estimate of the probabilities depends essentially upon personal knowledge, 
we speak of a subjective probability. This too plays a great part, especially in daily life.
The fear which ignorant people have of all that is new and unknown, proves that they 
understand that there is a great uncertainty in the estimate, and that it is greater for 
those who know but a little, than for those who know more and are therefore better able 
to judge. 

Roulette may be taken as an example of the objective probability which arises in
a well arranged game. A pointer turns on an almost frictionless pivot and points to the
scale of a circle whose center is in the pivot. The pointer is made to revolve quickly,
and the result of the game depends on where it stops. If the pointer stops opposite a space
(suppose a red one) previously selected as favourable, the game is won.

There we have as essential circumstances: 1) the length of the arc which is
traversed, this being determined by the initial velocity and the friction, 2) the initial position,
and 3) the manner in which the circle is divided.

The length of the arc is unknown, especially when we take care to exclude very
small velocities, and when the friction, as already mentioned, is very slight. So much
only may be regarded as given, that the frequency of a given length of the arc must, as
function of this length, be expressed by a functional law of errors of a nearly typical form.
For the frequency must go down, asymptotically, as far as 0, both below and above limits
of the arc which will be separated by many full revolutions of the pointer, and with at
least one maximum between these limits. If now, for instance, it depended on whether
the arc traversed was greater or smaller than a certain value, the apparatus would be
inexpedient; it would not allow any tolerably trustworthy a priori estimate. But if the
winning space (or spaces) is small in proportion to the total circumference and, moreover,
repeated regularly for each of the numerous revolutions, then the a priori determination
of the probabilities will be even very exact. For an area $ABab$, bounded by any finite,
continuous curve whatever (in the present case the curve of
errors of the different possible events), by the axis of abscissae,
and two ordinates, can always as a first approximation be
expressed as the sum of numerous equidistant small areas
with constant base, multiplied by the
interval ... and divided by the base $pp' = qq' = \dots$ And if we speak of
the total area of a curve of errors, then the series of which the first term is this
approximation, is even very convergent, in such a degree as ^(a;) «l-f » + + • • •
for small x, and the said approximation is sufficient for all practical purposes.
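How closely the chance of stopping in a regularly repeated winning space approaches the ratio of that space to the circumference can be illustrated numerically. The sketch below rests on assumptions not made in the text: the arc traversed is taken to follow a typical (normal) law of errors, measured in revolutions, and the numbers for the mean, the spread and the winning space are hypothetical.

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

def winning_probability(window, mean_turns, sd_turns, max_turns=200):
    """Chance that the arc traversed (in revolutions) ends in [k, k + window) for a whole k."""
    return sum(normal_cdf(k + window, mean_turns, sd_turns)
               - normal_cdf(k, mean_turns, sd_turns)
               for k in range(-max_turns, max_turns))

window = 0.05   # winning space as a fraction of the circumference (hypothetical)
print(winning_probability(window, 20.3, 1.0))   # spread of a whole revolution
print(winning_probability(window, 20.3, 0.1))   # spread much less than a revolution
```

With a spread of about one revolution the result is practically the ratio itself, while with a spread much smaller than a revolution it depends strongly on where the curve happens to lie, which corresponds to the inexpedient apparatus described above.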

That the initial position of the roulette is unknown, does not essentially change
the result of the foregoing, viz. that the probability of winning is $\frac{pp'}{pq}$. This uncertainty
can only cause an improvement of the accuracy of this approximation. If we may assume
that the pointer will as probably start from any point in the circle as from any other, this
determination $\frac{pp'}{pq}$ will even be exact, without any regard to the special kind of the
unknown function of frequency.

The ratio of the winning space on the circle $pp'$ to the whole circumference $pq$,
the third essential circumstance, cannot be determined wholly a priori, but demands a
measurement or a counting whose mean error it is essential to know.

The a priori determination of probability can thus, according to circumstances,
give results of the most different values, from the very poorest through gradual transition
up to such exact probabilities as agree with the suppositions in § 65 seqq., and permit the
probability to replace the whole law of errors for our predictions. But what the a priori
method cannot give, is a quantitative statement of the uncertainty which affects the
numerical value of the probability itself. Only when it is evident, as in the example of
the roulette, that this uncertainty is infinitely small, can we make use of a priori
probabilities in computations that are to be relied on. If in the work and struggles of our life,
we cannot entirely avoid building on altogether uncertain and subjective a priori estimates,
great caution is necessary, and in order not to overdo this caution for want of a proper
measure, we must try, by tact or experience, without any real method, to get an estimate
of the uncertainty.

Even by the best a priori determinations of probability caution is not superfluous;
the dice may be false, the pivot of the roulette may be worn out or bent, and so on.

§ 72. By the a posteriori determination of probability we build on the law of the
large numbers, inferring from a law of actual errors in the form of frequency to the law
of presumptive errors in that of the probability. We repeat the trial or the observation,
and count the numbers $m$ for the favourable and $n$ for the unfavourable events.

Owing to the signification of a probability as mean value, the single values being
0 for every unfavourable event, 1 for every favourable event, the probability $p$ for the
favourable event must be transferred unchanged from the law of actual errors to that of
presumptive errors; consequently

$$p = \frac{m}{m+n}. \qquad (130)$$





Since, according to the same consideration, the square of the mean deviation for
a single trial is $= \frac{mn}{(m+n)^2}$, and the number of the repetitions is $= m+n$,
the square of the mean errors must, according to (47), be

$$\lambda_2 = \frac{mn}{(m+n)(m+n-1)}, \qquad (131)$$

which is, therefore, the square of the mean error for a single trial, whether this is one of
those which we have made, or is a repetition which we are still to make, and for which
we are to compute the uncertainty.

If we then ask for the mean error of the probability $p = \frac{m}{m+n}$ got from the
$m+n$ repetitions, we have

$$\lambda_2(p) = \frac{mn}{(m+n)^2(m+n-1)} = \frac{p(1-p)}{m+n-1} \qquad (132)$$

as the square of this mean error.

The identity

$$\frac{mn}{(m+n)^2} + \frac{mn}{(m+n)^2(m+n-1)} = \frac{mn}{(m+n)(m+n-1)}$$

or

$$p(1-p) + \lambda_2(p) = \lambda_2 \qquad (133)$$

shows that the mean error at a single trial, when the probability $p$ is determined a posteriori
by $m+n$ repetitions, can be computed by (34), as originating in two mutually free
sources of errors, one of which is the normal uncertainty belonging to the probability, for
which $\lambda_2 = pq$ (123), while the other is the inaccuracy of the a posteriori determination,
for which $\lambda_2(p)$ is the square of the mean error.
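The a posteriori quantities and the identity connecting them can be computed exactly from the two counts. A minimal sketch follows; the function name is an illustrative assumption, and the counts 21 and 105 are used only as sample input.

```python
from fractions import Fraction

def a_posteriori(m, n):
    """m favourable and n unfavourable events out of m + n repetitions."""
    trials = m + n
    p = Fraction(m, trials)
    single_trial = Fraction(m * n, trials * (trials - 1))        # square of mean error, one trial
    of_p = Fraction(m * n, trials ** 2 * (trials - 1))            # square of mean error of p itself
    return p, single_trial, of_p

p, single_trial, of_p = a_posteriori(21, 105)
print(p, single_trial, of_p)
print(p * (1 - p) + of_p == single_trial)    # the two free sources of error add up
```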

The a posteriori determination therefore never gives an exact result, but only an
approximation to the probability. Only when the number of repetitions we employ is so
large that their reduction by a unit may be regarded as insignificant, we can immediately
employ the probabilities found by means of them as complete expressions for the law of
errors. But even by the very smallest number of repetitions of the trial, we not only obtain
some knowledge of the probability, but also a determination of the mean error, which may
be useful in predictions, and may serve as a measure of the caution that is necessary. It
must be admitted that it is not such a simple thing to employ these mean errors as those
in the ideal theory of probability, but it is not at all difficult.

As above mentioned, the a posteriori determination of probability seems to be purely 
empiric; theory, however, takes part in it, but is concealed in the demand, that all the 
trials we make use of must be repetitions, in the same way as the future trials whose 
results and uncertainty are predicted by the a posteriori probabilities. Transgressions of 
this rule, which reveal themselves by unsuccessful predictions, are by no means rare, and 
compel statistics and the other sciences which work with probabilities, to many alterations
of their theories and hypotheses, and to the division of the materials obtained by trial into
more and more homogeneous subdivisions.

Example. A die is inaccurate and suspected of being false. On trial, however,
we have on throwing it 126 times got “six” exactly 21 times, and so far, all is right.
The probability of “six” is found, consequently, to be $p = \frac{21}{126} = \frac16$; the square of the
mean error is $\lambda_2(p) = \frac16 \cdot \frac56 \cdot \frac1{125} = \frac1{900}$; the limits indicated by the mean errors are
consequently $\frac16 \pm \frac1{30}$.

If now we seek the probability that we shall not get “six” in 6 throws, the
probability is still as by an accurate die $(1-p)^6 = \left(\frac56\right)^6$, nearly $\frac13$; but what is now the
mean error? Ideally, its square should be $(1-p)^6\left(1-(1-p)^6\right)$, nearly $\frac29$. But if p
can have a small error dp, the consequent error in $(1-p)^6$ will be $-6(1-p)^5\,dp$; if
then the square of the mean error of p is $\lambda_2(p) = \frac{p(1-p)}{125}$, the total square of
the mean error of the probability of not getting “six” in 6 throws will be

$$\lambda_2 = (1-p)^6\left(1-(1-p)^6\right) + 36\,(1-p)^{10}\,\lambda_2(p) \approx \frac29 + \frac1{155}.$$

In every single game of this sort the mean error is therefore only slightly larger than with
an accurate die, but its actual value is so large (nearly ½) as to call for so much caution
on the part both of the player and of his opponent, that there is not much chance of
their laying a wager. This may be remedied by stipulating for a large number of repetitions
of the game. Let us examine the conditions if we are to play this game of making
6 throws without “six” 72 times. With the above approximate fractions there will be
expectation of winning 24 games. In the computation of the square of the
mean error of this result, the first term in the above $\lambda_2$ must be multiplied by 72, but
the second by $72^2$; hence

$$\lambda_2 = 16 + 33 = 49.$$

The mean error will be about 7, while it would only have been 4, if the die had been
quite trustworthy.
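
The arithmetic of this example can be retraced with exact fractions. The following is only a sketch: the function and variable names are illustrative and not part of the text, and it assumes that the squared mean error of p is p(1 - p)/(m + n - 1) and that the uncertainty of p enters through the derivative of (1 - p)^6.

```python
from fractions import Fraction
from math import sqrt

def posterior_mean_error_sq(m, n):
    """Squared mean error of p = m/(m+n), taken as p(1-p)/(m+n-1)."""
    p = Fraction(m, m + n)
    return p * (1 - p) / (m + n - 1)

m, n = 21, 105                            # 21 sixes in 126 throws
p = Fraction(m, m + n)                    # a posteriori probability of "six"
lam2_p = posterior_mean_error_sq(m, n)    # squared mean error of p

# Probability of no "six" in 6 throws, and the two parts of its squared mean error
q6 = (1 - p) ** 6
ideal_part = q6 * (1 - q6)                       # part present even for an accurate die
from_p = (6 * (1 - p) ** 5) ** 2 * lam2_p        # part caused by the uncertainty of p

games = 72
total = games * ideal_part + games ** 2 * from_p  # first part scales with N, second with N^2

print(p, lam2_p)
print(float(q6), float(ideal_part), float(from_p))
print(float(total), sqrt(total), sqrt(games * ideal_part))
```

The first part grows in proportion to the number of games and the second in proportion to its square, which is why the uncertainty of p dominates once the game is repeated many times.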

§ 73. We have mentioned already, in § 66, the skewness of the laws of errors
which is peculiar to all probability. It does not disappear, of course, in passing from the
law of actual errors to that of presumptive errors, and in the a posteriori determination of
probability it produces what we may call the paradox of unanimity: if all the repetitions
we have made agree in giving the same event, the probability deduced from this, a posteriori,
must not only be 1 or 0, but the square of the mean error $\lambda_2(p)$ of these determinations
(as well as the higher half-invariants) becomes = 0. Must we infer then, respectively,
to certainty or to impossibility, only because a smaller or greater number of repetitions
mutually agree? Must we consider a unanimous agreement as a proof of the absolute
correctness of that which is thus agreed upon? Of course not; nor can this inference be
maintained, if we look more closely at the law of errors $\mu_2 = 0$, $\mu_3 = 0$, ... $\mu_n = 0$.
Such a law of errors, to be sure, may signify certainty, but not when, as here, the ratio
... = ∞. A law of errors which is skew in an infinitely high degree, must indicate
something peculiar, even though the mean error be ever so small. Add to this that it is
not a strict consequence in practical calculations that, because the square of a number,
here that of the mean error, is = 0, the number itself must be = 0, but only that it
must be so small that it may be treated as a differential, which otherwise is indeterminate.
The paradox being thus explained, it follows that no objections against the use of a
posteriori probabilities in general can be based on it. But it must warn us to be cautious
in computations with such probabilities as observed values, where the computation, as the
method of the least squares, presupposes typical laws of errors. For this reason, we must
for such computations reject all unanimously or nearly unanimously determined probabilities
as unsuitable material of observation. Another thing is that we must also reject the
hypothesis or theory of the computation, if it does not explain the unanimity. As an
example we may take an examination of the probability of marriage at different ages. The
a posteriori statistics before the c. 20th year and after the c. 60th must not be used in the
computation of the sought constants of the formula, but the formula can be employed only
when it has the quality of a functional law of errors so that it approaches asymptotically
towards 0, both for low and high ages.

The paradox of unanimity has played rather a considerable part in the history of
the theory of probabilities. It has even been thought that we ought to compute a posteriori
probabilities by another formula, known as Bayes's rule,

$$p = \frac{m+1}{m+n+2} \qquad (134)$$

and not, as above, by the formula of the mean number

$$p = \frac{m}{m+n}.$$

The proofs some authors have tried to give of Bayes’s rule are open to serious objections.
In the “Tidsskrift for Mathematik” (Copenhagen, 1879), Mr. Bing has given a crushing
criticism of these proofs and their traditional basis, to which I shall refer those of my
readers who take an interest in the attempts that have been made to deduce the theory
of probabilities mathematically from certain definitions.





Bayes’s rule has not been employed in practice to any greater extent, particularly
not in statistics, though this science works entirely with a posteriori probability. But as
it makes the paradox of unanimity disappear in a convenient way, and as, after all, we
can neither prove nor disprove the exact validity of a formula for the determination of
an a posteriori probability, any more than we can do so for any transition whatever from
the law of actual errors to that of presumptive errors, the rule certainly deserves to be
tested by its consequences in practice before we give it up altogether. The result of such
a test will be that the hypothesis that Bayes’s rule will give the true probability, can
never deviate more than at most the amount of the mean error from the result of the
series of repetition, viz. that m events out of m + n have proved favourable. In order to
demonstrate this proposition we shall consider a somewhat more general problem.

If we assume that trials have been previously made which have given μ favourable,
ν unfavourable events, and that we have now in continuing the trials found m favourable
and n unfavourable events, then the probability, being looked upon as the mean value, is
determined by

$$p = \frac{m+\mu}{m+n+\mu+\nu}, \qquad (135)$$

of which Bayes’s formula is the special case corresponding to μ = ν = 1. Bayes’s rule
would therefore agree with the general rule, if we knew before the a posteriori determination
so much of the probability of both cases, as a report of one earlier favourable event and
one unfavourable event.

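In the more general case the square of the mean error at the single trial is now

$$\lambda_2 = \frac{(m+\mu)(n+\nu)}{(m+n+\mu+\nu)(m+n+\mu+\nu-1)},$$

and for the m + n trials is $(m+n)\lambda_2$.

If we now compare with this the square of the deviation between the new observation
and its computed value, that is, between m and $\frac{(m+n)(m+\mu)}{m+n+\mu+\nu}$, we get the ratio

$$\frac{\left(m-\frac{(m+n)(m+\mu)}{m+n+\mu+\nu}\right)^2}{(m+n)\,\lambda_2} = \frac{(m\nu-n\mu)^2\,(m+n+\mu+\nu-1)}{(m+n)(m+\mu)(n+\nu)(m+n+\mu+\nu)}. \qquad (136)$$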

It appears at once from the latter formula that the greatest imaginable value of the ratio is
the greatest of the two numbers μ and ν. In Bayes’s rule μ = ν = 1. Here, therefore, 1 is
the absolute maximum of the ratio of the square of deviation to that of the mean error.
With respect to Bayes’s rule the postulated proposition is hereby demonstrated. But at
the same time it will be seen that we can replace Bayes’s rule by a better one, if there is
only an a priori determination, however uncertain, of the probability we are seeking. If
we take the a priori probabilities ω for, and (1 − ω) against, instead of μ and ν, so that

$$p = \frac{m+\omega}{m+n+1}, \qquad (137)$$

then we are certain to avoid the paradox of unanimity where it might do harm, without
deviating so much as the mean error from the observation in the a posteriori
determination.
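
The three rules can be compared numerically. The sketch below is illustrative only: the function names are invented here, and the ratio is computed directly from its definition (squared deviation of m from its computed value, divided by the squared mean error for the m + n trials) rather than from any closed form.

```python
from fractions import Fraction

def generalized_rule(m, n, mu, nu):
    """Probability from m favourable and n unfavourable events, with mu and nu earlier ones."""
    return Fraction(m + mu, m + n + mu + nu)

def deviation_ratio(m, n, mu, nu):
    """Squared deviation of m from its computed value over the squared mean error of m+n trials."""
    s = m + n + mu + nu
    p = generalized_rule(m, n, mu, nu)
    lam2_single = Fraction((m + mu) * (n + nu), s * (s - 1))
    deviation = m - (m + n) * p
    return deviation ** 2 / ((m + n) * lam2_single)

# Unanimous series: 10 favourable events out of 10
mean_number = Fraction(10, 10)                      # formula of the mean number gives certainty
bayes = generalized_rule(10, 0, 1, 1)               # Bayes's rule, mu = nu = 1
prior = Fraction(1, 6)                              # hypothetical a priori probability
with_prior = generalized_rule(10, 0, prior, 1 - prior)

# Largest ratio found on a small grid for Bayes's rule
worst = max(deviation_ratio(m, n, 1, 1)
            for m in range(40) for n in range(40) if m + n > 0)
print(mean_number, bayes, with_prior, float(worst))
```

On this grid the ratio for Bayes's rule stays below 1, in agreement with the bound stated above; the grid search is a check, not a proof.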

Neither Bayes's rule nor this latter one can be of any great use; but we can always 
employ them, when the found probabilities can be looked upon as definitive results. On 
the other hand, the formula of the mean value may be used in all cases, if we interpret 
the paradox of unanimity correctly. Where the found probabilities are to be subjected to 
adjustment, the latter formula, as I have said, must be employed; nor can the other rules 
be of any help in the cases where observed probabilities have to be rejected on account 
of the skewness of the law of errors. 


XVII. MATHEMATICAL EXPECTATION AND ITS MEAN ERROR. 

§ 74. Whether the theory of probability is employed in games, in insurances, or
elsewhere, in nearly all cases in which we can speak of a favourable event, the prediction
of the practical result is won through a computation of the mathematical expectation.
The gain which a favourable event entails, has a value, and the chance of winning
it must as a rule be bought by a stake. The question is: How are we to compare
the value of the latter with that of which the game gives us expectation? Imagine the
game to be repeated, and the number of repetitions N to become indefinitely large, then
it is clear, according to the definition of probability, that the sum of the prizes won, if
each of them is V, must be pNV, when p indicates the probability. The gain to be
expected from every single game is consequently pV, and this product of the probability
and the value of the prize is what we call mathematical expectation.

The adjective “mathematical” warns us not to consider pV as the real value which
the possible gain has for a single player. This value, certainly, depends, not only objectively
on the quantity of good things which form the prize, but also on purely subjective circumstances,
among others on how much the player previously possesses and requires of the same
sort of good things. An attempt which has been made to determine by means of what is
called the “moral expectation”, whether a game is advantageous or not, must certainly be
regarded as a failure. For it takes into account the probable change in the logarithm of
the player’s property, but it does not take into consideration his requirements and other
subordinate circumstances. We shall not here try to solve this difficulty.

It is evident, with respect to the mathematical expectation, that if we play several
unbound games at the same time, the total mathematical expectation is equal to the sum
of that of the several games. The same is the case, if we play a game in which each
event entitles the player to a special (positive or negative) prize. In this latter case we
speak of the total mathematical expectation as made up of partial ones.

Example 1. We play with a die in such a way that a throw of 1 or 2 or 3 wins
nothing; a throw of 4 or 5 wins 2 s., and one of 6 wins 8 s. The total mathematical
expectation is then ½ × 0 + ⅓ × 2 + ⅙ × 8 = 2 s. A stake of 2 s. will consequently
correspond to an even game. We might also deduct the 2 s. throughout, so that a throw
of 1, or 2, or 3, causes a loss of 2 s. and a throw of 6 a gain of 6 s.; the total mathematical
expectation then becomes = 0.
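
As a check on Example 1, the total mathematical expectation can be summed face by face. This is a minimal sketch with illustrative names, assuming an accurate die with probability 1/6 for each face.

```python
from fractions import Fraction

# Prize in shillings for each face of the die, as stated in Example 1
prizes = {1: 0, 2: 0, 3: 0, 4: 2, 5: 2, 6: 8}
p_face = Fraction(1, 6)   # assumed: every face equally probable

# Total expectation as the sum of the partial expectations
expectation = sum(p_face * prize for prize in prizes.values())

# Deducting the stake throughout turns the game into an even game
net = {face: prize - expectation for face, prize in prizes.items()}
net_expectation = sum(p_face * gain for gain in net.values())

print(expectation, net_expectation, net)
```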

Example 2. In computations of the various kinds of life-insurances the basis is
1) the table of the number of persons $l(a)$ living at a given age a. The probability of
such a person living x years is $= \frac{l(a+x)}{l(a)}$; of his dying within x years $= \frac{l(a)-l(a+x)}{l(a)}$;
of his dying at the exact age of a + x years $= \frac{l(a+x)-l(a+x+1)}{l(a)}$; and from these all
other necessary probabilities may be found; 2) the rate of interest ρ, which serves for the
valuation of future payments of capital, or annuities certain $\frac{1}{\rho}\left(1-(1+\rho)^{-x}\right)$.

The value of an endowment of capital, V, payable in x years, if the person who
is now a years old is then alive, is thus equal to the mathematical expectation

$$(1+\rho)^{-x}\,\frac{l(a+x)}{l(a)}\,V = \frac{D(a+x)}{D(a)}\,V, \qquad (138)$$

which, as we see, is most easily computed by means of a table of the function

$$D(a) = l(a)\,(1+\rho)^{-a}.$$

Such a table is of great use for other purposes also.

The value of an annuity, v, due at the end of every year through which a
person now a years old shall live, can be computed as a sum of such payments, or by
the formula

$$\sum_{x=1}^{x=\infty} v\,\frac{l(a+x)}{l(a)}\,(1+\rho)^{-x} = \frac{v}{D(a)}\sum_{x=1}^{x=\infty} D(a+x). \qquad (139)$$

Here $l(\infty) = 0$ and $D(\infty) = 0$.

But it deserves to be mentioned that this same mathematical expectation is most
safely looked upon as a total mathematical expectation in a game whose events are the various
possible years of death; the probability of death in the first year being $\frac{l(a)-l(a+1)}{l(a)}$, in
the second $\frac{l(a+1)-l(a+2)}{l(a)}$, and so on; while the corresponding values are annuities certain
of v for varying duration. In this way we find for the value of the life-annuity the
expression

$$\frac{v}{\rho\,l(a)}\sum_{x=0}^{x=\infty}\left(l(a+x)-l(a+x+1)\right)\left(1-(1+\rho)^{-x}\right). \qquad (140)$$

Since the sum $\sum_{0}^{\infty}\left(l(a+x)-l(a+x+1)\right) = l(a)$, we find by solution of the last parenthesis
that the expression may be written

$$\frac{v}{\rho} - \frac{v}{\rho\,l(a)}\sum_{x=0}^{x=\infty}\left(l(a+x)-l(a+x+1)\right)(1+\rho)^{-x},$$

and this shows that the value of the life-annuity is the difference between the capital sum
of which the yearly interest is v and the value of a life-insurance of $\frac{v}{\rho}$ payable at the
beginning of the year of death.

In life-insurance computations integrals are often employed with great advantage, 
instead of the sums we have used here; periodical payments (yearly, half-yearly, or quarterly) 
being reduced to continuous payments, and vice versa. 
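
The two ways of valuing a life-annuity, as a sum of endowments and as a game over the possible years of death, can be compared on a small table. The sketch below is illustrative: the life table and the rate of interest are invented for the purpose, and the function names are not taken from the text.

```python
from fractions import Fraction

def annuity_by_endowments(l, rho, v=1):
    """Sum of endowments: v * sum over x >= 1 of l(a+x)/l(a) * (1+rho)^-x."""
    w = 1 / (1 + rho)
    return v * sum(l[x] * w ** x for x in range(1, len(l))) / l[0]

def annuity_by_year_of_death(l, rho, v=1):
    """Expectation over years of death, each valued as an annuity certain of x payments."""
    w = 1 / (1 + rho)
    total = 0
    for x in range(len(l) - 1):
        deaths = l[x] - l[x + 1]              # persons dying between ages a+x and a+x+1
        total += deaths * (1 - w ** x) / rho  # annuity certain for x years
    return v * total / l[0]

def annuity_as_difference(l, rho, v=1):
    """Capital v/rho less an insurance of v/rho payable at the beginning of the year of death."""
    w = 1 / (1 + rho)
    insurance = sum((l[x] - l[x + 1]) * w ** x for x in range(len(l) - 1)) / l[0]
    return v / rho - (v / rho) * insurance

# Hypothetical life table l(a), l(a+1), ...; it must end in 0
life_table = [1000, 900, 700, 400, 100, 0]
rate = Fraction(1, 20)   # hypothetical rate of interest, 5 per cent

print(annuity_by_endowments(life_table, rate),
      annuity_by_year_of_death(life_table, rate),
      annuity_as_difference(life_table, rate))
```

With exact fractions the three values coincide, which is the identity used in the text.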

§ 75. That mathematical expectation is not a solid value, but an uncertain claim,
is expressed in the law of errors for the mathematical expectation, and particularly in its
mean error; for, owing to the frequent repetitions and combinations in games and insurances,
it does not matter much that the isolated laws of errors, here as for the probabilities, are
often skew. If the value V is given free of error, the square of the mean error of the
mathematical expectation, is, according to the general rule, to be computed by

$$\lambda_2(pV) = p(1-p)V^2. \qquad (141)$$

If there are N repetitions of the same game we get

$$\lambda_1 = pNV$$

and

$$\lambda_2 = p(1-p)NV^2, \qquad (142)$$

and for the total expectation of mutually free games, we have

$$\lambda_2 = \sum p(1-p)NV^2. \qquad (143)$$

By free games we may pretty safely understand such as are not settled by the various
events of the same trial or game. (As to these, see § 76.)

The mean error is excellently adapted for computing whether we ought to enter
upon a proposed game, or how highly we are to value uncertain claims or outstanding
balance of accounts. Such things of course are regulated by the boldness or caution of
the person concerned; but even the most cautious man may under fairly typical circumstances
be contented with diminishing the value of his mathematical expectation by 4
times the amount of the mean error, and it would be sheer foolhardiness, if a passionate
player would venture a stake which exceeded the mathematical expectation by the quadruple
of its mean error. On the other hand, a simple subtraction or addition of the mean error
cannot be counted a very strong proof of caution or boldness respectively.

Example 1. A game is arranged in such a way that the probability of winning
from the person who keeps the bank is 1/10; the prize is 8 $. In n games the mathematical
expectation with mean error is then (0.8 n ± 2.4 √n) $. If the banker has no property,
but may expect 144 games to be played before the prizes are to be paid, he cannot without
imprudence estimate his negative mathematical hope, his fear, lower than 0.8 × 144 + 2.4 × 12
= 144 $. He must consequently fix the stake for each game at about one dollar, and will
thus stand a chance of seeing the bank broken about once in six times. If, however, he
has got so much capital or credit, as also so many customers, that he can play about
2304 games, his business will become very safe; the average gain of 20 cts. per game is
460 $ 80 cts., or exactly 4 times as great as the mean error 2 $ 40 cts. × 48. But who
will enter upon such a game against the banker (a game, after all, which is not worse
than so many others)? The very stake is already greater than the mathematical expectation;
every prudent regard to part of the mean error will only augment the disproportion.
No prudent man will enter upon such a game, unless he can thereby avoid a greater risk;
in this way we insure our risks, because it is too dangerous to be “one’s own insurer”.
If the game is arranged in such an entertaining way that we pay 40 cts. for the excitement
only of taking part in every game, then even rather a cautious person may also continue
for 144 games, the mean error (as above) being only 28 $ 80 cts. or
144 (0.8 − (1.0 − 0.4)) $. For a poor fellow, who has only one dollar in his pocket, but
who must for some reason necessarily get 8 $, such a game may also be the best resource.
But if a man owns only 2304 $, and fails if he cannot get 8 times as much, then he
would be exceedingly foolhardy if he played 2304 times or more in that bank. If we must
run the risk, we can do no better than venturing everything on one card; if we distribute
our chances over n repetitions, then we must, beyond the mathematical expectation, hope
for √n times that part of the mean error which might help by the one attempt.
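
The banker's figures follow from the expectation pNV and the squared mean error p(1 - p)NV². A small sketch, with an illustrative function name, assuming a probability of winning of 1/10 and a prize of 8 dollars as in the example:

```python
from math import sqrt

P_WIN = 0.1    # probability that the player wins
PRIZE = 8.0    # prize in dollars

def expectation_and_mean_error(n_games):
    """Expected total of prizes paid in n_games, with its mean error."""
    expectation = P_WIN * PRIZE * n_games
    mean_error = PRIZE * sqrt(P_WIN * (1 - P_WIN) * n_games)
    return expectation, mean_error

# 144 games: expectation plus one mean error is what the stakes must cover
e144, s144 = expectation_and_mean_error(144)
print(e144, s144, e144 + s144)

# 2304 games at a stake of one dollar: the gain of 20 cts. per game against the mean error
e2304, s2304 = expectation_and_mean_error(2304)
gain = 2304 * 1.0 - e2304
print(gain, s2304, gain / s2304)
```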

Example 2. Two fire-insurance companies have each insured 10,000 farms for a
total insurance of £ 10,000,000. The yearly probability of damage by fire is 1/1000, and
both must every year spend £ 5000 on management. Both have sufficient guaranty-fund
to rest satisfied with one single mean error as security against a deficit in each fiscal year.
How high must either fix its annual premium, when there is the difference that the company
A has 10,000 risks of £ 1,000, while B has insured:

        n                   a              na               na²
      100 farms for    £ 10,000      £  1,000,000     10,000 × (10)⁶
      400   "    "        5,000         2,000,000     10,000    "
    1,500   "    "        2,000         3,000,000      6,000    "
    2,500   "    "        1,000         2,500,000      2,500    "
    2,000   "    "          500         1,000,000        500    "
    1,500   "    "          200           300,000         60    "
    2,000   "    "          100           200,000         20    "
   10,000 farms                      £ 10,000,000     29,080 × (10)⁶

Since p(1 − p) = 0.000999, the mathematical expectation ± its mean error is in the case
of A £ 10,000 ± £ 3,161, in the case of B £ 10,000 ± £ 5,390; the premiums
are therefore £1 16s. 4d. and £2 0s. 10d. respectively for £1,000; i.e., B must reinsure
part of its risks.
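
The comparison of the two companies can be retraced as follows. This is a sketch under stated assumptions: the yearly fire probability is taken as 1/1000 (the value consistent with an expectation of £10,000 and a mean error of £5,390 for B), the premium is taken as expectation plus one mean error plus management cost, and all names are illustrative.

```python
from math import sqrt

P_FIRE = 1 / 1000   # assumed yearly probability of fire

# Portfolios as (number of farms, sum insured on each farm in pounds)
company_a = [(10_000, 1_000)]
company_b = [(100, 10_000), (400, 5_000), (1_500, 2_000), (2_500, 1_000),
             (2_000, 500), (1_500, 200), (2_000, 100)]

def expectation_and_mean_error(portfolio, p=P_FIRE):
    """Expected yearly damage and its mean error for mutually free risks."""
    expectation = p * sum(n * a for n, a in portfolio)
    variance = p * (1 - p) * sum(n * a * a for n, a in portfolio)
    return expectation, sqrt(variance)

def premium_per_1000(portfolio, management=5_000, p=P_FIRE):
    """Yearly premium per 1000 pounds insured: expectation + one mean error + management."""
    expectation, mean_error = expectation_and_mean_error(portfolio, p)
    insured = sum(n * a for n, a in portfolio)
    return 1000 * (expectation + mean_error + management) / insured

for name, portfolio in (("A", company_a), ("B", company_b)):
    e, s = expectation_and_mean_error(portfolio)
    print(name, round(e), round(s), round(premium_per_1000(portfolio), 4))
```

The premiums come out in decimal pounds; converting the fractional part to shillings and pence gives figures close to those quoted in the example.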

§ 76. The mean error and, in general, the law of error, of the total mathematical
expectation for mutually bound events which may be considered co-ordinate events of the
same trials, are computed in half-invariants by means of the sums of powers. If the trial
can have n various events, of which the one whose probability is $p_i$ entails a gain of the
value $a_i$, and we imagine the same repeated a sufficiently large number of times (N times),
the account will show:

$a_1$ occurring $p_1N$ times,
. . . . .
$a_n$ occurring $p_nN$ times.

$$s_0 = (p_1 + \cdots + p_n)N = N,$$
$$s_1 = (p_1a_1 + \cdots + p_na_n)N,$$
$$s_2 = (p_1a_1^2 + \cdots + p_na_n^2)N,$$

and the half-invariants for the single trial will be

$$\mu_1 = \frac{s_1}{s_0} = p_1a_1 + \cdots + p_na_n = \text{the total mathematical expectation} = H(1,\ldots n),$$
$$\mu_2 = \lambda_2\left(H(1,\ldots n)\right) = p_1a_1^2 + \cdots + p_na_n^2 - (p_1a_1 + \cdots + p_na_n)^2. \qquad (144)$$

By this formula, therefore, we must in such cases compute the square of the mean
error of the total mathematical expectation for the single trial. For the square of the
mean error of the expectation from N trials we have consequently

$$\lambda_2\left(NH(1,\ldots n)\right) = N\left\{p_1a_1^2 + \cdots + p_na_n^2 - \left(H(1,\ldots n)\right)^2\right\}. \qquad (145)$$

By even game we understand a game where the total mathematical expectation
is 0; the last term of this formula will consequently disappear in such a game. As the
mean error does not depend on the absolute values of the gains or losses, but only on
their differences, we may in the computation of the squares of the mean errors reduce to
even game by subtracting the mathematical expectation from all the gains, and adding it
to the losses. Thus we may write:

$$\lambda_2\left(H(1,\ldots n)\right) = p_1(a_1-H)^2 + \cdots + p_n(a_n-H)^2.$$

This rule then differs from the rule of unbound games only in the absence of the factors
$(1-p_1), \ldots (1-p_n)$.

We can now compute the mean errors in the examples 1 and 2, in § 74. In
No. 1 we have

$$\lambda_2 = \tfrac12(-2)^2 + \tfrac13(0)^2 + \tfrac16(6)^2 = 8.$$
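
The rule for bound events can be set beside the rule for unbound games on the die game of § 74. A minimal sketch with illustrative function names:

```python
from fractions import Fraction

def bound_mean_error_sq(probs, gains):
    """Mutually exclusive events of one trial: sum of p*a^2 minus the squared expectation."""
    h = sum(p * a for p, a in zip(probs, gains))
    return sum(p * a * a for p, a in zip(probs, gains)) - h * h

def unbound_mean_error_sq(probs, gains):
    """Mutually free games: sum of p*(1-p)*a^2."""
    return sum(p * (1 - p) * a * a for p, a in zip(probs, gains))

probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
gross = [0, 2, 8]    # prizes in shillings
net = [-2, 0, 6]     # the same game reduced to an even game

print(bound_mean_error_sq(probs, gross))    # unchanged by the reduction to even game
print(bound_mean_error_sq(probs, net))
print(unbound_mean_error_sq(probs, net))    # what the rule of unbound games would give
```

The bound rule gives the same value for the gross and the net gains, since only differences between the gains matter; the unbound rule applied to the same numbers gives a smaller value because of the factors (1 - p).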

In the life-annuity example we now see the advantage of using the longer formula
(140) for the value of the annuity, rather than the formula (139) which gives the value as
the sum of a number of endowments; for the partial expectations are here not unbound,
and only the deaths in the several years of age exclude one another and can be considered
co-ordinate events in the same game. For the square of the mean error of the life-annuity
we have, from (144):

$$\lambda_2 = \frac{v^2}{\rho^2\,l(a)}\sum_{x=0}^{x=\infty}\left(l(a+x)-l(a+x+1)\right)\left(1-(1+\rho)^{-x}\right)^2 - \left(\frac{v}{\rho\,l(a)}\sum_{x=0}^{x=\infty}\left(l(a+x)-l(a+x+1)\right)\left(1-(1+\rho)^{-x}\right)\right)^2.$$

§ 77. In the above studies on the mean errors of mathematical expectations we
have supposed that the probabilities we use are free from error, being either determined
a priori by good theory or found a posteriori from very large numbers of repetitions. This
determination is not complete in the cases in which the probabilities determined a posteriori
are found only by small numbers of trials, or if probabilities computed a priori presuppose
values observed with sensibly large mean errors. The same warning must be taken with
respect to other values which may enter into the computed mathematical expectations; the
value of the gains, for instance, may depend on the future rate of interest. Whether some
of the manifold sources of errors are to be omitted in a computation of the mean error,
or not, must for each special case depend on the relative smallness of their parts of the
total $\lambda_2$. As to the theory of probability it is characteristic only that the parts of the squares
of the mean errors, considered in §§ 75 and 76, are, as a rule, very important, while the
analogous parts in other problems are often insignificant. When the orbit of a planet is
computed by the method of the least squares, then, in order to restrict the limits of
research for its next discovery, we have to compute the mean errors of its co-ordinates
at the next opposition. Ordinarily these mean errors are so large that the $\lambda_2$ for its future
observations may be wholly omitted, though this $\lambda_2$ is analogous to those from §§ 75 and
76. But when we have computed a table of mortality by the method of the least squares,
we can certainly find by that method the mean error $\sqrt{\lambda_2(p)}$ of the probability of
life computed from the table, but if we are to predict anything as to the uncertainty
with regard to n lives, and with regard to the corresponding mathematical expectation $npa$,
then we must not, unless n is very great, take the mean error as $na\sqrt{\lambda_2(p)}$, but we
must, as a rule, first take $\lambda_2(H)$ into consideration, and consequently use the formula
(example § 72).






CORRECTION FOR THE MOMENTS OF A
FREQUENCY DISTRIBUTION IN
TWO VARIABLES¹


By 


William Dowell Baten 
University of Michigan 


In certain statistical problems it is beneficial to divide the
given data into classes or groups and investigate the distribution
in this form. The moments determined for the distribution divided
into classes differ from the moments determined from the
original data. It is the object of this article to show how to
modify the former to secure the latter for a frequency distribution
in two variables.

After the data, given for a frequency distribution of one
variable, have been divided into classes the class mark is then
the representative of the items in a class. This is assuming that
the mean of the items falling in a class is equal to the class mark.
For a large number of items in a class, distributed throughout
the entire class, the class mark differs very little from the average
of the items in the class. But the average of the items raised
to a power is not equal to the class mark raised to the same
power. Hence corrections should be made to the moments determined
from a distribution which is divided into classes.

For a distribution of two variables x and y the data are
divided into xy-classes, where the class mark of an x-class
is considered to be the representative of the items falling in this
class, while the class mark of a y-class is the representative of
all items in this particular class. The coordinates of the point
in the xy-plane, whose abscissa is the class mark of the x-class
and whose ordinate is the class mark of the y-class, may be
considered to be the class mark of the double class or the xy-class.

1 Presented to the American Mathematical Society, Sept. 12, 1930.

Let the frequencies of the distribution be represented by
the volumes of the volume-compartments as shown in the figure.
The sum of all such compartments is the total of the frequencies
and should be equal to the number of items in the distribution.
The little solid marked in the figure is the frequency of the items
falling in the 5th x-class and in the 3rd y-class. OT and OF
are the class marks of this x-class and this y-class. $(OT)^n(OF)^m$
multiplied by the frequency of the items falling in this double
xy-class may differ considerably from the sum $(OC)^n(OK)^m + (OA)^n(OG)^m + \ldots$,
hence corrections must be made to
the moments obtained from the distribution divided into classes
where the double class marks are the representatives of the items
in the class. If the class units are made smaller and are allowed
to become very near to zero the errors are not so large, for it
must be remembered that our results are only approximations.

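By definition the n'th, m'th moment of the distribution
which is divided into classes is

$$\nu'_{n:m} = \sum_i\sum_j x_i^n\,y_j^m \int_{-1/2}^{1/2}\int_{-1/2}^{1/2} f(x_i+h,\;y_j+k)\,dh\,dk,$$

where $(x_i, y_j)$ is considered to be the class mark of the
i, j-class, and the double summation extends over all the classes.
It is further assumed that $f(x,y)$ is a function
which can be expanded into a Taylor series. The above
becomes

$$\nu'_{n:m} = \sum_i\sum_j x_i^n\,y_j^m\left[f + \frac{1}{3!\,2^2}\left(\frac{\partial^2 f}{\partial x^2}+\frac{\partial^2 f}{\partial y^2}\right) + \frac{1}{5!\,2^4}\left(\frac{\partial^4 f}{\partial x^4}+\frac{10}{3}\frac{\partial^4 f}{\partial x^2\,\partial y^2}+\frac{\partial^4 f}{\partial y^4}\right)+\cdots\right]_{x=x_i,\;y=y_j}.$$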

312 


CORRECTIONS FOR THE MOMENTS 



^ Now Hse the EiUer-Madaurin Simunation*.formula for two 

for fiuda* the value of this double eummatioo. Thh 

formula is 


*This formula is developed on pages 317-319, 



fF. D, BATBN 


313 


LZuktj>j J(Jk</)dzdy~y 0(x,^)dy]^ tfk,y)dx 



h^UOx^y) 

cli-L 

l>*l « 
b^UCx,y) 

c/W 

a 

b^U(x,y) 



14-4 dx by. 

c ^ 

^ I4i0 dar^. 

C 

^ I440dy^ - 

c ■' 

a 


which is the double summation of the function U {x^y) from 
a to A on the jr-axis and from c to d along the y-axis. 
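Applying this formula to the double summation above

$$\nu'_{n:m} = \iint x^n y^m\left[f + \frac{1}{3!\,2^2}\left(\frac{\partial^2 f}{\partial x^2}+\frac{\partial^2 f}{\partial y^2}\right) + \frac{1}{5!\,2^4}\left(\frac{\partial^4 f}{\partial x^4}+\frac{10}{3}\frac{\partial^4 f}{\partial x^2\,\partial y^2}+\frac{\partial^4 f}{\partial y^4}\right) + \frac{1}{7!\,2^6}\left(\frac{\partial^6 f}{\partial x^6}+7\frac{\partial^6 f}{\partial x^4\,\partial y^2}+7\frac{\partial^6 f}{\partial x^2\,\partial y^4}+\frac{\partial^6 f}{\partial y^6}\right) + \cdots + \frac{1}{(s+1)!\,2^s}\sum_t\frac{(s+1)!}{(s-t+1)!\,(t+1)!}\frac{\partial^s f}{\partial x^{s-t}\,\partial y^t} + \cdots\right]dx\,dy + 0 + 0 + \cdots,$$

where s and t are even numbers. In obtaining this result it was assumed
that $f(x,y)$, $f'(x,y)$, ..., $x^k y^w f(x,y)$, $x^k y^w f'(x,y)$, ...
vanish or become negligible at the limits on the x and y axes;
k and w are positive integers.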

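Therefore

$$\nu'_{n:m} = \mu'_{n:m} + \frac{1}{3!\,2^2}\left[n^{(2)}\mu'_{n-2:m} + m^{(2)}\mu'_{n:m-2}\right] + \frac{1}{5!\,2^4}\left[n^{(4)}\mu'_{n-4:m} + \frac{10}{3}\,n^{(2)}m^{(2)}\mu'_{n-2:m-2} + m^{(4)}\mu'_{n:m-4}\right]$$
$$\qquad + \frac{1}{7!\,2^6}\left[n^{(6)}\mu'_{n-6:m} + 7\,n^{(4)}m^{(2)}\mu'_{n-4:m-2} + 7\,n^{(2)}m^{(4)}\mu'_{n-2:m-4} + m^{(6)}\mu'_{n:m-6}\right] + \cdots$$
$$\qquad + \frac{1}{(s+1)!\,2^s}\left[n^{(s)}\mu'_{n-s:m} + \frac{(s+1)!}{(s-2+1)!\,(2+1)!}\,n^{(s-2)}m^{(2)}\mu'_{n-s+2:m-2} + \frac{(s+1)!}{(s-4+1)!\,(4+1)!}\,n^{(s-4)}m^{(4)}\mu'_{n-s+4:m-4} + \cdots + m^{(s)}\mu'_{n:m-s}\right] + \cdots,$$

where $n^{(k)} = n(n-1)\cdots(n-k+1)$.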

If m = 0 the formula becomes the formula for obtaining the
moments about a fixed origin for one variable. This has been
done by Sheppard and Carver.
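
The coefficients of the general formula can be generated mechanically. The sketch below is not the author's procedure: it assumes class intervals of unit width and a total frequency of 1, and it uses the equivalent closed form in which lowering the x-exponent by 2s contributes the factor C(n, 2s) / ((2s + 1) 2^(2s)), and likewise for y. The function names are illustrative.

```python
from fractions import Fraction
from math import comb

def grouping_coefficients(n):
    """Map each remaining exponent n-2s to C(n,2s) / ((2s+1) * 2^(2s)), for unit class width."""
    return {n - 2 * s: Fraction(comb(n, 2 * s), (2 * s + 1) * 4 ** s)
            for s in range(n // 2 + 1)}

def grouped_moment_expansion(n, m):
    """Coefficients c[(i, j)] such that the grouped moment (n, m) = sum of c[(i, j)] * true moment (i, j)."""
    return {(i, j): a * b
            for i, a in grouping_coefficients(n).items()
            for j, b in grouping_coefficients(m).items()}

# The key (0, 0) is the constant term, the true moment of order (0, 0) being taken as 1
for (i, j), c in sorted(grouped_moment_expansion(4, 2).items(), reverse=True):
    print((i, j), c)
```

Setting m = 0 reproduces the one-variable coefficients, in line with the remark above.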

If n and rrt take on int^al values 




» 




Sj'i- 

^ I. 

~ M 

rz /-o 

* 


B'Z 

*l/u’ 

m ^0.2 

fJU^ 

/a 

. 

0 144' 



Ku 

* i- LU' 

* A ^/■/' 

v;.3 

^:.-3 * iK: 

/ * 

v;,*- 

K.z 

ju' + -kM' 

A^h'S /Z^^'O 




Ka 

•/- ~ iJ 

/a ^03 

A^Z- 

, *A6 ^o.w 

H 




4-~ JLl' 

■4 ^3-1 

■^TeKr 


<:r 

■M' 

*iK„* 

J-M' j 

80 ^O.’l ^ 

^'u-Ka ^ 

oM' 
z ^r-z 


^ i 4.-.0 *Z4 ^ A:0 "^360 ' 




316 CORRECTIONS FOR THE MOMENTS 

^0-4 K-Z '*'60 K:0 ^24 ^O'Z ^ ^ ^ 

'^•i' ^4:3*'2^S:3 *4^4.1 '*‘^^0.3 '*'3^2:1 '*260^0:1 ' 

^3:2^3i4-'*"k^3;2 ^^3:0^6^HZ *‘i60^l-0 

%-.4^4-.4 *k ^2:4 ^4:2 '*'80 ^0:4 '* 4 ^2:2 '*'§0 ^4:0 


4- ^ IJ.^ + — ^ +■ *' '~ • 

/60 ^2:0 160 ^o-z 6400 


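From the above the $\mu'$'s can be obtained.

$$\mu'_{1:3} = \nu'_{1:3} - \tfrac14\nu'_{1:1},$$
$$\mu'_{3:3} = \nu'_{3:3} - \tfrac14\nu'_{3:1} - \tfrac14\nu'_{1:3} + \tfrac1{16}\nu'_{1:1},$$

etc.

By translating the origin to $(M_x, M_y)$

$$\mu_{2:2} = \nu_{2:2} - \tfrac1{12}\nu_{2:0} - \tfrac1{12}\nu_{0:2} + \tfrac1{144},$$
$$\mu_{1:3} = \nu_{1:3} - \tfrac14\nu_{1:1},$$
$$\mu_{2:3} = \nu_{2:3} - \tfrac14\nu_{2:1} - \tfrac1{12}\nu_{0:3},$$
$$\mu_{3:3} = \nu_{3:3} - \tfrac14\nu_{1:3} - \tfrac14\nu_{3:1} + \tfrac1{16}\nu_{1:1},$$

etc.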

In making corrections for the double moments it must be
remembered to correct the single moments of the x's and the y's.*

*See Frequency Curves by H. C. Carver in Handbook of Math. Statistics.

Euler-Maclaurin Summation for Two Variables

Suppose it is possible to find a function g(x,y) such that

$$g(x+1,y+1) - g(x+1,y) - g(x,y+1) + g(x,y) = f(x,y),$$

that is, $\Delta_x\Delta_y\,g(x,y) = f(x,y)$ or $g(x,y) = \Delta_x^{-1}\Delta_y^{-1} f(x,y)$,
where $\Delta$ represents finite difference and $\Delta^{-1}$
represents finite integration. If g(x,y) is such a function,
then

$$g(a+1,c+1) - g(a+1,c) - g(a,c+1) + g(a,c) = f(a,c),$$
$$g(a+2,c+1) - g(a+2,c) - g(a+1,c+1) + g(a+1,c) = f(a+1,c),$$
$$g(a+1,c+2) - g(a+1,c+1) - g(a,c+2) + g(a,c+1) = f(a,c+1),$$
$$g(a+2,c+2) - g(a+2,c+1) - g(a+1,c+2) + g(a+1,c+1) = f(a+1,c+1),$$


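. . . . .

$$g(b,d) - g(b,d-1) - g(b-1,d) + g(b-1,d-1) = f(b-1,d-1),$$
$$g(b+1,d+1) - g(b+1,d) - g(b,d+1) + g(b,d) = f(b,d).$$

Adding,

$$g(b+1,d+1) - g(b+1,c) - g(a,d+1) + g(a,c) = \sum_{y=c}^{d}\sum_{x=a}^{b} f(x,y),$$

or

$$\sum_{y=c}^{d}\sum_{x=a}^{b} f(x,y) = \left[\left[g(x,y)\right]_{x=a}^{x=b+1}\right]_{y=c}^{y=d+1}.$$

If it is possible to find the function g then the
double sum $\sum\sum f(x,y)$ can be found. Expand g(x+1, y+1)
in a Taylor series.

$$g(x+1,y+1) = g + \frac{\partial g}{\partial x} + \frac{\partial g}{\partial y} + \frac{1}{2!}\left(\frac{\partial^2 g}{\partial x^2} + 2\frac{\partial^2 g}{\partial x\,\partial y} + \frac{\partial^2 g}{\partial y^2}\right) + \cdots = e^{D_x + D_y}\,g(x,y),$$

where $D_x^n\,g$, $D_y^n\,g$, $D_x^n D_y^m\,g$ represent respectively

$$\frac{\partial^n}{\partial x^n}\,g(x,y), \quad \frac{\partial^n}{\partial y^n}\,g(x,y), \quad \frac{\partial^{n+m}}{\partial x^n\,\partial y^m}\,g(x,y).$$

Hence

$$f(x,y) = \left(e^{D_x+D_y} - e^{D_x} - e^{D_y} + 1\right)g(x,y) = \left(e^{D_x}-1\right)\left(e^{D_y}-1\right)g(x,y),$$

where the D's are operators operating on the function g(x,y).
Therefore

$$g(x,y) = \frac{1}{\left(e^{D_x}-1\right)\left(e^{D_y}-1\right)}\,f(x,y),$$

where the operators are now operating upon the function f(x,y).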


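To develop $\frac{1}{(e^u-1)(e^v-1)}$ into a Taylor series it
is necessary to develop $\frac{uv}{(e^u-1)(e^v-1)}$ in a Taylor
series and then divide by uv. This becomes, after $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$ replace
u and v respectively,

$$\frac{1}{(e^u-1)(e^v-1)} = \frac1{uv} - \frac1{2u} - \frac1{2v} + \frac14 + \frac{u}{12v} + \frac{v}{12u} - \frac{u}{24} - \frac{v}{24} + \frac{uv}{144} - \frac{u^3}{720v} - \frac{v^3}{720u} + \frac{u^3}{1440} + \frac{v^3}{1440} + \cdots,$$

where $\frac1u$, $\frac1v$ represent integration.

Using these results

$$\sum_{y=c}^{d}\sum_{x=a}^{b} f(x,y) = \left[\left[g(x,y)\right]_{x=a}^{x=b+1}\right]_{y=c}^{y=d+1}$$
$$= \left[\left[\iint f\,dx\,dy - \frac12\int f\,dy - \frac12\int f\,dx + \frac14 f + \frac1{12}\int\frac{\partial f}{\partial x}\,dy + \frac1{12}\int\frac{\partial f}{\partial y}\,dx - \frac1{24}\frac{\partial f}{\partial x} - \frac1{24}\frac{\partial f}{\partial y} + \frac1{144}\frac{\partial^2 f}{\partial x\,\partial y} - \frac1{720}\int\frac{\partial^3 f}{\partial x^3}\,dy - \frac1{720}\int\frac{\partial^3 f}{\partial y^3}\,dx + \frac1{1440}\frac{\partial^3 f}{\partial x^3} + \frac1{1440}\frac{\partial^3 f}{\partial y^3} + \cdots\right]_{x=a}^{x=b+1}\right]_{y=c}^{y=d+1}.$$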
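
The numerical coefficients of the operator series can be checked from the Bernoulli numbers, since the coefficient of u^(k-1) in 1/(e^u - 1) is B_k / k!. The sketch below uses only the standard library; the function names are illustrative and the convention B_1 = -1/2 is assumed.

```python
from fractions import Fraction
from math import comb, factorial

def bernoulli(k_max):
    """Bernoulli numbers B_0 .. B_k_max with the convention B_1 = -1/2."""
    b = [Fraction(1)]
    for k in range(1, k_max + 1):
        b.append(-sum(comb(k + 1, j) * b[j] for j in range(k)) / (k + 1))
    return b

def operator_coefficient(i, j, b):
    """Coefficient of u^i * v^j (i, j >= -1) in 1 / ((e^u - 1)(e^v - 1))."""
    return b[i + 1] * b[j + 1] / (factorial(i + 1) * factorial(j + 1))

B = bernoulli(6)
for i, j in [(-1, -1), (0, -1), (0, 0), (1, -1), (1, 0), (1, 1), (3, -1), (3, 0)]:
    print((i, j), operator_coefficient(i, j, B))
```

A negative exponent -1 stands for one integration, so the pair (1, -1) is the term in which f is differentiated once in x and integrated once in y.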





THE STANDARD ERROR OF A MULTIPLE
REGRESSION EQUATION¹


By 

John Rice Miner 


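Since a multiple regression equation is essentially a hyperplane,
fitted by the method of least squares, its standard error
may be obtained from Gauss’ standard error of a function recently
discussed by Schultz (1930). Let the equation be

$$x_1 = b_{12.34\ldots m}\,x_2 + b_{13.24\ldots m}\,x_3 + \cdots + b_{1m.23\ldots(m-1)}\,x_m,$$

where $x_1$ is the dependent variable, $x_2, x_3, \ldots, x_m$ the independent
variables, each measured from its respective mean, and
$b_{12.34\ldots m}, \ldots, b_{1m.23\ldots(m-1)}$ the regression
coefficients. Then the determinant of Schultz’s equation
(10) becomes

$$D = \begin{vmatrix} n & 0 & 0 & \cdots & 0 \\ 0 & n\sigma_2^2 & n r_{23}\sigma_2\sigma_3 & \cdots & n r_{2m}\sigma_2\sigma_m \\ 0 & n r_{23}\sigma_2\sigma_3 & n\sigma_3^2 & \cdots & n r_{3m}\sigma_3\sigma_m \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & n r_{2m}\sigma_2\sigma_m & n r_{3m}\sigma_3\sigma_m & \cdots & n\sigma_m^2 \end{vmatrix} = n^m\sigma_2^2\sigma_3^2\cdots\sigma_m^2 \begin{vmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & r_{23} & \cdots & r_{2m} \\ 0 & r_{23} & 1 & \cdots & r_{3m} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & r_{2m} & r_{3m} & \cdots & 1 \end{vmatrix} = n^m\sigma_2^2\sigma_3^2\cdots\sigma_m^2\,\Delta. \qquad (1)$$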


¹From the Department of Biology of the School of Hygiene and Public
Health of the Johns Hopkins University.





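Let $\Delta_{ij}$ be the cofactor of the element in the i’th row
and the j’th column of $\Delta$. Then

$$[\alpha\alpha] = \frac{D_{11}}{D} = \frac1n, \qquad [\alpha\beta] = \frac{D_{12}}{D} = 0,$$
$$[\beta\beta] = \frac{D_{22}}{D} = \frac{\Delta_{22}}{n\sigma_2^2\,\Delta}, \qquad [\beta\gamma] = \frac{D_{23}}{D} = \frac{\Delta_{23}}{n\sigma_2\sigma_3\,\Delta}, \qquad [\beta\mu] = \frac{D_{2m}}{D} = \frac{\Delta_{2m}}{n\sigma_2\sigma_m\,\Delta},$$
$$\frac{\partial f}{\partial a} = 1, \quad \frac{\partial f}{\partial b} = x_2, \quad \frac{\partial f}{\partial c} = x_3, \;\ldots,\; \frac{\partial f}{\partial m} = x_m,$$
$$\epsilon^2 = \frac{\Sigma v^2}{n-m} = \frac{n\,\sigma_1^2\left(1-R^2_{1(23\ldots m)}\right)}{n-m}.$$

Therefore, substituting these values in Schultz’s equation
(27.1), we have

$$\sigma_f^2 = \frac{\sigma_1^2\left(1-R^2_{1(23\ldots m)}\right)}{n-m}\left\{1 + \sum_{i=2}^{m}\frac{\Delta_{ii}}{\Delta}\frac{x_i^2}{\sigma_i^2} + 2\sum_{i<j}\frac{\Delta_{ij}}{\Delta}\frac{x_ix_j}{\sigma_i\sigma_j}\right\}. \qquad (2)$$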





For a simple regression equation this reduces to

$$\sigma_f = \frac{\sigma_1\left(1-r_{12}^2\right)^{1/2}}{(n-2)^{1/2}}\left(1+\frac{x_2^2}{\sigma_2^2}\right)^{1/2}. \qquad (3)$$

This agrees with the expression given by Pearson (1913),
if we remember that $x_2$ is measured from its mean and that
Pearson does not correct for the number of parameters.

For a regression equation with two independent variables

(4)  $$\sigma_f^2 = \frac{\sigma_{1.23}^2}{n - 3}\left[1 + \frac{x_2^2}{\sigma_{2.3}^2} + \frac{x_3^2}{\sigma_{3.2}^2} - \frac{2\,r_{23}\,x_2\,x_3}{\sigma_{2.3}\,\sigma_{3.2}}\right].$$

As an example of the application of this formula we may calculate the standard error of the mean heart-weight ($X_1$) of the array of persons with an age ($X_2$) of 52.92 years and a body-weight ($X_3$) of 49.93 kilograms in a population of 213 persons characterized by the following biometric constants:



$\bar{X}_1$ = 348.9 g;       $\sigma_1$ = 79.4 g;        $r_{12}$ = +0.114

$\bar{X}_2$ = 59.65 yrs.;    $\sigma_2$ = 17.54 yrs.;    $r_{13}$ = +0.652

$\bar{X}_3$ = 56.45 kg;      $\sigma_3$ = 14.38 kg;      $r_{23}$ = -0.185


From these data $r_{12.3} = 0.315$ and $r_{13.2} = 0.689$, and the regression equation of heart-weight on age and body-weight is

$$X_1 = 66.03 + 1.100\,X_2 + 3.848\,X_3,$$

from which the mean heart-weight of persons aged 52.92 years and weighing 49.93 kg is 316.4 g.
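The arithmetic of this example can be retraced with a short sketch. It uses only the constants quoted in the text and the usual first-order formulas for partial correlations and regression coefficients. The final step, the standard error itself, assumes that formula (4) has the ordinary least-squares form (residual variance over n - 3, times one plus a quadratic form in the deviations); that form is an assumption here, since the printed formula is not legible, and all variable names are illustrative.

```python
from math import sqrt

# Constants quoted in the text: heart-weight (1), age (2), body-weight (3).
mean1, mean2, mean3 = 348.9, 59.65, 56.45
s1, s2, s3 = 79.4, 17.54, 14.38
r12, r13, r23 = 0.114, 0.652, -0.185
n = 213


def partial_r(r_ab, r_ac, r_bc):
    """First-order partial correlation r_ab.c."""
    return (r_ab - r_ac * r_bc) / sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


r12_3 = partial_r(r12, r13, r23)
r13_2 = partial_r(r13, r12, r23)

# Regression coefficients of X1 on X2 and X3.
b12_3 = (s1 / s2) * (r12 - r13 * r23) / (1 - r23 ** 2)
b13_2 = (s1 / s3) * (r13 - r12 * r23) / (1 - r23 ** 2)
intercept = mean1 - b12_3 * mean2 - b13_2 * mean3


def predict(age, body_weight):
    """Mean heart-weight of the array with the given age and body-weight."""
    return intercept + b12_3 * age + b13_2 * body_weight


# ASSUMED form of formula (4): ordinary least-squares standard error of the array mean.
x2, x3 = 52.92 - mean2, 49.93 - mean3
sigma_1_23 = s1 * sqrt((1 - r12 ** 2) * (1 - r13_2 ** 2))
quad = (x2 ** 2 / s2 ** 2 + x3 ** 2 / s3 ** 2
        - 2 * r23 * x2 * x3 / (s2 * s3)) / (1 - r23 ** 2)
std_error = sigma_1_23 / sqrt(n - 3) * sqrt(1 + quad)

print(round(r12_3, 3), round(r13_2, 3), round(predict(52.92, 49.93), 1))
```

The partial correlations, the two regression coefficients, and the predicted heart-weight agree with the figures quoted in the text to the precision given there; the intercept comes out slightly above the printed 66.03, presumably through rounding.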

Substituting the appropriate values of the constants in (4) 
and remembering that 
6.52 , and 


a/oi 


{0-993} 


f 


(-6.72)' 




(/7.54f {0.966} (/4.38}^ (0.966) 


2(~0.i25)(-6.72)C-6.52) l^ ^ 
07.54.} (J4.3e) (0960 j 




REFERENCES 

Pearson, Karl. 1913. On the probable errors of frequency constants. Biometrika, 9:1-10.

Schultz, Henry. 1930. The standard error of a forecast from a curve. J. Amer. Stat. Assoc., 25:139-185.



SAMPLING IN THE CASE OF CORRELATED 


OBSERVATIONS 


By 

Cecil C. Craig 

National Research Fellow


Dr. E. C. Rhodes, in a paper in the Journal of the Royal Statistical Society,¹ has considered the distribution of characteristics of samples of N when the individual observations are not assumed to be independent. As he points out, there are many important cases in which the usual assumption of independence or randomness in the observations is not justifiable. In the present paper will be explained a method based on the semi-invariants of Thiele for the calculation of the characteristics of the sought distributions in this case, which is especially to be preferred to the method based on moments when it is supposed that the observations are normally correlated. In the case in which it is further assumed that only consecutive observations are correlated, in addition to Dr. Rhodes' results, the third semi-invariant (which is the same as the third moment about the mean) of the variance and the mean and the variance of the third and fourth moments about the mean are given.

Let the $N$ observations composing a sample be given by values of $x_1, x_2, \ldots, x_N$ respectively and let $F(x_1, x_2, \ldots, x_N)$ be the $N$-way probability function of $x_1, x_2, \ldots, x_N$.*

¹The Precision of Means and Standard Deviations When the Individual Errors Are Correlated, Vol. 90 (1927), pp. 135-143.




Then the semi-invariants, ' * of , “ * , 

defined by 



«. OO, . . . , - oo 


which is to be regarded as a formal identity in 

is first expanded by the multinomial law and 
then each term "-in the result is replaced by 

We shall pass over the characteristics of distributions of 
means, since the method of semi-invariants is equivalent to that 
of moments in this case, and take up the distribution of moments 
about the mean in samples of N . Following the method pre¬ 
viously used by the author in the case of independent observa¬ 
tions,^ let 


( 2 ) 





*2 with 



Then let ( S,. 6^. * • * be the probability function of 

6, 6^.,,(2 5^ = 0. The semi-invariants **’ 

of ‘ ' *1 are defined by 

*Following Cramer, I distinguish between probability and frequency functions. $F$ is the "cumulative" frequency function and thus the integral is an $N$-way Stieltjes integral.

²An Application of Thiele's Semi-invariants to the Sampling Problem, Metron, Vol. 7, No. 4 (1928), pp. 3-75.



( 3 ) 



e * 



-•«> * - -c^ 


We have at once, 


r 


j-i 


^ij 



/N^t iV f \ 


and as the author has previously remarked,^ we can also write 


f ^ ^ \C^J / ^ I . \ 


. # 

so long as the relation is only used to find the values of ^ ri>f ' 
in which at least one of the subscripts is zero. 

Then $S_k(\nu_n)$, the $k$'th semi-invariant of the $n$'th moment about the mean in samples of $N$, is given by the formula


5^ )- 


. < - V) ' J - 

- ^ - 

[A,/4^-j [cy/c^/---] rJs/t!-'- 


¹Loc. cit., pp. 18, 19.

the notation referring to moments of $\delta_1, \delta_2, \ldots$, the summation including all terms for which

r(a,+o.^ + - -■•)*t (c^-i-Cg *■■■ )* ■ •s./r, 


Q, & <2^ = 


A ^ 




In particular; 

~ 7j ^ ^n,o * ^^n,o~ '^n,o o '^<5 o '4<Ji40 

'S. (.^n ) - T/*-* ,,.o ^ <»,o <o /] . 


5) 






On writing out the moments in terms of the semi-invariants and then using (4), the sought semi-invariants are obtained.

In the case that the $N$ observations are normally correlated and $F(x_1, x_2, \ldots, x_N)$ is the $N$-dimensional normal probability function, the left-hand member of (4) vanishes for $k > 2$.

¹See the author's paper cited, p. 21, formula (25).

²For a detailed explanation of this kind of calculation see the author's paper cited, pp. 23-27.

If we suppose that the standard deviations of $x_1, x_2, \ldots, x_N$ are all equal (which we shall always do) and take as the simplest case that $x_1, x_2, \ldots, x_N$ are normally correlated and that the correlation as measured by the Pearsonian coefficient, $r_{ij}$, is the same for each pair, $x_i$, $x_j$, of the set of $N$ observations, we get


* ^f>zo ~ ^cozo 


X =x =/ - 

/to 10/0 otto 


~ 7S ~ If 


if the common value of $r_{ij}$ be denoted simply by $r$. But if the observations are independent and the parent population is normal we have


AL = A' , = A' 


20 


020 


X =A' =A' ^ 

/fO fOiO etto 


A/-/ 

N 


A 


20 > 



20 • 


Thus it follows that the distributions of the characteristics of samples of $N$ in this particular case of dependent observations are the same as if the observations were independent and taken from a normal population of variance $(1 - r)\lambda_{20}$.
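This equivalence can be checked numerically without any sampling. The sketch below is not the author's semi-invariant argument; it simply compares the covariance matrix of the deviations from the sample mean in the two situations, on the assumption that the observations are jointly normal with common mean, so that equal covariance matrices of the deviations imply equal distributions of every moment about the mean. The function names and the values of `N`, `VAR`, and `R` are illustrative.

```python
def centered_covariance(cov):
    """Covariance matrix of the deviations x_i - mean(x): C * cov * C, with C = I - J/N."""
    n = len(cov)
    c = [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)] for i in range(n)]

    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    return matmul(matmul(c, cov), c)


def equicorrelated(n, variance, r):
    """Covariance matrix with equal variances and the same correlation r for every pair."""
    return [[variance * (1.0 if i == j else r) for j in range(n)] for i in range(n)]


N, VAR, R = 5, 2.0, 0.3  # illustrative values only

dependent = centered_covariance(equicorrelated(N, VAR, R))
independent = centered_covariance(equicorrelated(N, (1 - R) * VAR, 0.0))

# Largest entrywise difference between the two covariance matrices.
max_gap = max(abs(dependent[i][j] - independent[i][j])
              for i in range(N) for j in range(N))
print(max_gap)
```

The two matrices agree up to floating-point rounding, which is the statement of the text in matrix form: the common correlation affects only the sample mean, not the deviations from it.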

In case $x_1, x_2, \ldots, x_N$ are normal it is convenient to express the right-hand members of (5) directly in terms of the semi-invariants for $n = 2, 3, 4$. For that purpose we

shall adopt the following notation. Let the linear form ^ A; 

. J*t V J 

be denoted by . Then (4) becomes 


( 6 ) 




Thus in a symbolic sense A's and λ's are equivalent. But with regard to the subscripts of the A terms in the expansion of the left member of (6) we use a different convention than for the subscripts of the λ's. We set

¹See the author, loc. cit., p. 19.








^20 
, / 


^t2 » ^iO/C ” 


‘U:Ay]Uh 

the summations, of course, running over all values of $i$ and $j$ from 1 to $N$. But since

i:A^u^2r.A^,Ajj-(EA,y 

the second relation reduces to 

Similarly 

5,rv4)=^ ^Au*3ZA,,Atj *6ZAij A,, Aj, ), 

^Lk^jk ^k-i ^ 

5 ,( 03 ). 0 . 

S,(03)-5'a(5r4^6Z:/\-,4^; A,j-^4ZAtj ). 







3, W- "jp («/<?, ^ii At At). 


To illustrate the use of these formulas and to give some results in a case of practical interest, let us suppose that the set of $N$ observations composing a sample may be assigned an order in which only consecutive observations are correlated and in a constant degree. Thus our observations might be prices or indices taken at the ends of consecutive time intervals. We suppose, then, that


^OliO "" ^ 004*0 


= r A- 


^40/ ” ^4 00/ 


• » O . 


The first step in the calculation is to obtain the values of the various A's which enter into the formulas (7). $A_{11}$ is found from $A_1^2$, $A_{12}$ from $A_1 A_2$, and so on. We get

~ ^ ) ^SO > 

A A A fB ^ ^ \ \ 


A >*A - ^ ) A 

W /yf fsfiL ^ ^ 


( 8 ) 


ZO f 

L±Ar 2 r 


■ ■ • « A. 


A =4 


7, A/-/ 4 




A / \ \ 




I < k< fJ-t 
!<}< 

/t-y/>/ 





Then, on substitution in (7), we have finally 


These two results are given by Dr. Rhodes, loc. cit., though there is a slight misprint in the second one as given there. The remainder of the results given here are believed to be new.
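As a hedged illustration of the setting in which only consecutive observations are correlated, the expected value of the second moment about the sample mean can be obtained directly from the covariance matrix, without the semi-invariant machinery. The closed form below is derived from the definition, E[(1/N) Σ (x_i - x̄)²] = (tr Σ - 1'Σ1/N)/N for observations with a common mean; it is not copied from the displayed results of the paper, which are not legible here, and the function names are illustrative.

```python
def expected_second_moment(cov):
    """E[(1/N) * sum (x_i - xbar)^2] for observations with common mean and covariance cov."""
    n = len(cov)
    trace = sum(cov[i][i] for i in range(n))
    total = sum(sum(row) for row in cov)
    return (trace - total / n) / n


def consecutive_cov(n, variance, r):
    """Covariance matrix in which only neighbouring observations are correlated."""
    def entry(i, j):
        if i == j:
            return variance
        if abs(i - j) == 1:
            return r * variance
        return 0.0

    return [[entry(i, j) for j in range(n)] for i in range(n)]


def closed_form(n, variance, r):
    """variance * [(N - 1)/N - 2 (N - 1) r / N^2], derived from the definition above."""
    return variance * ((n - 1) / n - 2 * (n - 1) * r / n ** 2)


for size in (3, 5, 10):
    direct = expected_second_moment(consecutive_cov(size, 1.5, 0.4))
    print(size, direct, closed_form(size, 1.5, 0.4))
```

With r = 0 the expression reduces to the familiar (N - 1)/N times the population variance; a positive correlation between neighbours lowers the expected sample variance.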











)]A 


4 - 

so » 




f±L .{li. ^ 

~ /V* ''J ' 



It should be observed that the expressions for $S_k(\nu_n)$ for $N < 3$, and for $k \geq 2$ for $N < 5$, are in general not valid, since it can be seen by reference to (8) that all the types of A's used in the formulas (7) do not exist for values of $N$ so small. But for these small values of $N$, the values of the characteristics for which expressions are given above can be readily computed directly.







Stanford University. 



THE RELATION BETWEEN THE MEANS 
AND VARIANCES, MEANS SQUARED AND 
VARIANCES IN SAMPLES FROM COMBINA¬ 
TIONS OF NORMAL POPULATIONS 


By 

G. A. Baker


The distributions of the means and variances of samples from the combinations of normal populations have been discussed in a previous paper.¹ It is known that if the sampled population is not normal the means and variances of samples are not independent.

The present discussion aims to give some idea of the relation between the means and the variances, means squared and variances of samples from a population that is the combination of normal populations. To this end the case of samples of two from such populations is rather completely investigated. Also empirical random sampling results for two special populations are presented.

Suppose that a population is represented by 


(1) f(och 




vrr~ 


a-mr 




¹"Random Sampling from Non-Homogeneous Populations," Metron, Vol. VIII, No. 3 (1930), pp. 1-21.


If a method used by Karl Pearson¹ is followed, the probability of

$x_1$ in $dx_1$ is $f(x_1)\,dx_1$,
$x_2$ in $dx_2$ is $f(x_2)\,dx_2$,

and the probability of the concurrence of these two events is

(2)  $f(x_1)\,f(x_2)\,dx_1\,dx_2$

which may be written


/ 


O-f-kfan 




( 3 ) 

,£ (e-iW-^*1}] 


Now 

Whence 

{ X, ■= -L *x 
^ oc • 

Also $dx_1\,dx_2$ may be replaced² by

¹Appendix to Papers by "Student" and R. A. Fisher, Biometrika, Vol. XIX (1925), p. 522.

²R. A. Fisher: "Frequency Distribution of the Values of the Correlation Coefficient in Samples from an Indefinitely Large Population," Biometrika, Vol. X (1915), p. 507.


(5) c <afZ) dx 

In virtue of (4) and (3), (6) is obtained.

(6) *W[^ 


This is the correlation surface for the means and standard deviations of samples of two drawn from (1). To get the correlation surface of the means and variances write




U 


dZ 


du 

zm 


Then 




_e_ 

Z'fu 


i [<? ^ w/] 




CO 


K ftOW^ +xf* (-ru+oc-m) ^ 




1 ] 



is the desired surface.

The locus of mean $u$'s for given $\bar{x}$'s is


( 8 ) 


U 


e k- '-7^ J 

>rJ\A 


-OC^ 






The locus of the mean $\bar{x}$'s for given $u$'s is


{9)cr 


mke 
<r 








j t j 

e(fS’- . ef-fu- 

' <r^V/ I 


. “SF^TTj 


The correlation surface for the means squared ($z = \bar{x}^2$) and variances is

lit/ ) — ^ * 

<><» {fes 



The locus of the mean $u$'s for given $z$'s is

-t , i 4-fSk 

in)u 






The locus of the mean $z$'s for given $u$'s is


( 12 ) 


k^e cr^ 


e + 


[- 




+ \<r^* 


j 

sjfC'’ 


multiplied by 

^fsk 








By expanding the denominators of (8), (9), (11), and (12) by the multinomial theorem, it can be shown that each of these loci is essentially parabolic, if $\sigma^2 - 1 \neq 0$. They are subject to an exponential influence at the beginning of the range of the independent variable, which influence rapidly diminishes as the independent variable takes on higher values.

The probability relations in general between means and variances, means squared and variances will be expected to approximate those for the case of samples of two, because of the following considerations. Suppose that $n$ (the number in the sample) is large.¹ When a large proportion of the sample comes
from the first component, the first term of (7) with 2 in the 
numerator of the exponent replaced by 77 and with u ^ replaced 
by u will be an approximation to the surface of the means 
and variances. Similarly, when a large proportion of the sample 
cpmes from the second component, the second term of (7) with 2 
in the numerator of the exponent replaced by 77 and with u ^ 
replaced by oc will be an approximation to the surface of 
the means and variances. When about equal proportions of the 
sample come from each component, the last term of (7) with 2 
in the numerator^of each exponent replaced by ^ and with a « 
replaced by u will be an approximation to the surface of 
the means and variances. Or, all together, (7) with the 
mentioned changes in the exponents of the terms, with proper 
weighting of the terms, and with u ^ replaced by u ^ is a 
proportionate approximation to the distribution of the means and 
variances of samples drawn from a population represented by (1). Further, increasing $n$ will not influence relations (8), (9), (11), and (12) as approximations for the general case except the exponential term, if it is assumed that the denominators are expanded and then multiplied by the numerators, for $k$ occurs to the same power in the numerators and denominators.

^Note: The effect of /f and of the binomial coefficients is roughly as fol¬ 
lows. If the 77 ¥-1 terms denoting s from the first component of (1) 
and 77 -s from the second component are divided into thirds, then, if 
I are the exponents of /f in the middle terms, 

or approximately, since is large and since only a proportionate expression 
is desired 9 ^ ^ -^ 3 *^ 

or tlw exponents of k of the middle terms of the diree sections above 
are times the exponents of k in (7). The effect of increasing Tj be¬ 
cause of the binomial coefficients is to weight the middle section of the 
possible surfaces to a much greater extent than the extreme sections, 
that with r» very large the last term of (7) with 2 replaced by rt 
becomes an approximation to the desired surface. 



From (8), (9), (11), and (12) it is clear that the parameters of the sampled population have great influence on the regression relations considered. It should be borne in mind in this connection that many flattened and skewed, as well as bimodal, distributions can be adequately represented by combinations of normal populations. Also, results (8), (9), (11), and (12) can be extended to the sums and differences of any number of normal curves, subject to the condition that the resultant is always positive.

In 1925, Dr. Neyman¹ gave the correlation coefficient between the deviations of the means of samples from the mean of the sampled population and the variances of these samples for samples of $n$ drawn at random from an infinite uni-variate population in terms of the betas of the sampled population as


( 13 ) . 

Similarly, the correlation coefficient between the deviations 
squared of the means of samples from the mean of the sampled 
population and the variances is 

( 14 ) ^JACilr ,. ■ 


¹Splawa-Neyman, "Contributions to the Theory of Small Samples Drawn from a Finite Population," Biometrika, Vol. XVII (1925), pp. 472-479.

Under certain very special conditions the statement of these two coefficients may give an adequate idea of the regression relation between the means and variances, means squared and variances of samples from a population represented by (1). In general the mere statement of these coefficients will not give any useful notion of the actual probability relations. This is true because: (a) the regression relations between means and variances, means squared and variances of samples from a population represented by (1) are essentially parabolic, as shown for samples of two and as seems probable for larger samples, (b) the frequency arrays may vary markedly in dispersion, in skewness, and in other characteristics.
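The correlation between sample means and sample variances can be sketched in terms of the central moments of the parent population. The block below assumes the standard moment form, ρ = μ₃ / sqrt(μ₂ (μ₄ - ((n-3)/(n-1)) μ₂²)), which is equivalent to sqrt(β₁) / sqrt(β₂ - (n-3)/(n-1)); the printed expression (13) is not legible here, so its agreement with this form is an assumption. The check enumerates every ordered sample from a small discrete population, which is a hypothetical stand-in and not one of the populations of the paper.

```python
from itertools import product
from math import sqrt


def central_moments(values, probs):
    """Second, third and fourth central moments of a discrete population."""
    mean = sum(p * v for v, p in zip(values, probs))
    return [sum(p * (v - mean) ** k for v, p in zip(values, probs)) for k in (2, 3, 4)]


def corr_formula(values, probs, n):
    """Assumed moment form of the correlation between sample mean and sample variance."""
    mu2, mu3, mu4 = central_moments(values, probs)
    return mu3 / sqrt(mu2 * (mu4 - (n - 3) / (n - 1) * mu2 ** 2))


def corr_enumerated(values, probs, n):
    """Exact correlation obtained by enumerating all ordered samples of size n."""
    e_m = e_v = e_mm = e_vv = e_mv = 0.0
    for idx in product(range(len(values)), repeat=n):
        p = 1.0
        for i in idx:
            p *= probs[i]
        xs = [values[i] for i in idx]
        m = sum(xs) / n
        v = sum((x - m) ** 2 for x in xs) / n
        e_m += p * m
        e_v += p * v
        e_mm += p * m * m
        e_vv += p * v * v
        e_mv += p * m * v
    cov = e_mv - e_m * e_v
    return cov / sqrt((e_mm - e_m ** 2) * (e_vv - e_v ** 2))


skewed = ([0.0, 1.0, 3.0], [0.5, 0.3, 0.2])  # hypothetical population
print(corr_formula(*skewed, 4), corr_enumerated(*skewed, 4))
```

For a symmetric population the third moment vanishes and the coefficient is zero, however strongly the variances depend on the means in a non-linear way; this is the situation the text warns about.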

To illustrate these remarks, samples of four were drawn 
from two special populations by throwing dice. 

Suppose that a population is represented by


:is) fu)= 


/ 


r 




The first four moments of f {X) about its mean are 

a = -o 

' i+k 


u. 


l+k 


ay. - 




3+6Tnf+ k(3</^+ 6 ) 

t+k 




CHART A 



























TABLE I

Correlation Table Showing the Relation Between the Means and Variances of Samples of Four from Population I

[The body of this table is not legible in the source scan. The variances are grouped in class intervals of width 15.]




Whence 


(16) 


' [/v- » 7 ,^ k (tr^+ 


^ \j+m,^-hk(a^+7n^ ')^ 


Thus, for any special population of the form (15), these coefficients can be easily calculated.

Samples of four were drawn from a population approximately represented by


(18) 



-^(x-1.7) 


1 


The actual sampled population is shown in Chart A and is 
hereinafter called Population I. 

Table I shows the distribution of 1038 samples of four drawn from Population I with respect to the observed values of the means and the variances. The arrays for constant values of the variances are at first distinctly bimodal, gradually becoming unimodal. Chart I shows the means of arrays of Table I with the regression lines as calculated without correction for groupings. It is apparent that the locus of the mean variances for a given value of the means diverges a great deal from a straight line. This regression relation looks as though it were a normal curve, which is what would be expected from (8) with $\sigma^2 - 1 = 0$. The theoretical and actual correlation coefficients for this and three subsequent tables are compared in Table V and the constants of the marginal distributions of Tables I to IV are presented in Table VI.
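The behaviour described for Table I can be imitated with a small simulation. The sketch below draws samples of four from an equal mixture of two unit-variance normal populations with means 0 and 3; these parameters are hypothetical, chosen only to give a symmetric bimodal population, and are not the constants of Population I. The pseudo-random generator replaces the dice used in the paper. The seed, class intervals, and function names are likewise illustrative.

```python
import random


def draw(rng, k_weight, m, sigma):
    """One observation from the mixture: N(0, 1) with weight 1, N(m, sigma^2) with weight k."""
    if rng.random() < k_weight / (1.0 + k_weight):
        return rng.gauss(m, sigma)
    return rng.gauss(0.0, 1.0)


def simulate(n_samples=20000, size=4, k_weight=1.0, m=3.0, sigma=1.0, seed=1931):
    """Return (mean, variance) for each simulated sample."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_samples):
        xs = [draw(rng, k_weight, m, sigma) for _ in range(size)]
        mean = sum(xs) / size
        var = sum((x - mean) ** 2 for x in xs) / size
        pairs.append((mean, var))
    return pairs


def array_means(pairs, edges):
    """Mean variance within each class interval of the sample mean."""
    out = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        vs = [v for mean, v in pairs if lo <= mean < hi]
        out.append(sum(vs) / len(vs) if vs else float("nan"))
    return out


def correlation(pairs):
    """Product-moment correlation of the (mean, variance) pairs."""
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    return sxy / (sxx * syy) ** 0.5


pairs = simulate()
locus = array_means(pairs, [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
r = correlation(pairs)
print([round(v, 2) for v in locus], round(r, 3))
```

Under these assumed parameters the mean variance is highest for sample means near the centre of the population and falls away on both sides, while the correlation coefficient stays close to zero: a single coefficient conveys nothing of the bell-shaped locus.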

If the deviations of the means of the samples of Table I from the mean of Population I are squared, Table II results. Chart II shows the means of arrays and regression lines of Table II. The regression lines are very poor fits to the means of the arrays which are, apparently, exponential loci.

Table III shows the distribution of 1058 samples of four drawn from a population approximately represented by


(19) 





rr ^ *3(Fn ® 


-^(x-z.4)* 


with respect to the observed values of the means and variances of the samples. The actual sampled population is presented in Chart B and is hereinafter called Population II. Chart III shows the means of arrays and regression lines of Table III. This chart resembles Chart I in that the locus of the mean variances for given values of the means is so obviously non-linear. Also, a glance at Table III is sufficient to see that the arrays vary markedly in skewness.

Table IV shows the relation between the means squared and variances of samples of four from Population II. Chart IV shows the means of arrays and regression lines for Table IV. In this case the regression relations seem to be fairly near linear, and the frequency distributions of the arrays do not change strikingly.





CHART I 

The Means of Arrays and Regression Lines of the Means and 
Variances of Samples from Population I 




TABLE II 

Correlation Table Showing the Relation Between the Means 
Squared and Variances of Samples of Four from Population I 


[The body of this table is not legible in the source scan. The means squared are grouped in class intervals of width 10, from 0 to 260, and the variances in class intervals of width 15.]


CHART II

The Means of Arrays and Regression Lines of the Means Squared and Variances of Samples from Population I

NOTE: The last thirteen class intervals of the means squared are grouped into one group.


CHART B




















TABLE III 

Correlation Table Showing the Relation Between the Means and 
Variances of Samples of Four from Population II 


[The body of this table is not legible in the source scan.]


CHART III

The Means of Arrays and Regression Lines of the Means and 
Variances of Samples from Population II 





TABLE IV

Correlation Table Showing the Relation Between the Means Squared and Variances of Samples of Four from Population II

[The body of this table is not legible in the source scan.]


CHART IV 

The Means of Arrays and Regression Lines of the Means Squared 
and Variances of Samples from Population II 





TABLE V

Correlation Coefficients of Tables I-IV

                        Correlation Coefficient
Number of Table         Theoretical     Actual¹
I                           .00           -.05
II                         -.34           -.37
III                         .40            .37
IV                         -.07           -.05


TABLE VI

Constants of the Marginal Distributions of Tables I-IV in Terms of Class Intervals

Marginal Distribution                            Mean       Standard Deviation
Means of Samples from Population I               .252²      2.467
Variances of Samples from Population I           4.890³     2.900
Means Squared of Samples from Population I       3.591³     3.203
Means of Samples from Population II              .07        2.237
Variances of Samples from Population II          3.570³     2.854
Means Squared of Samples from Population II      1.408³     1.744

¹Calculated without corrections for grouping.

²Is so far from zero because of the groupings employed. Many means were exactly odd integers. These were all put forward into higher classes, making the calculated mean too large.

³Origin taken at the beginning of the range.
























From the results for the case of samples of two and from the results of empirical sampling, it seems clear that the simplest regression relation that is generally applicable to the means and variances, means squared and variances, of samples from populations which are the combinations of normal populations is parabolic. For small samples and for certain values of the parameters of the sampled population the regression relations may involve exponential terms that are quite important. As the size of the samples increases, it is expected that this exponential term will decrease in influence. It seems plausible that even with large samples the regression relation of means and variances, means squared and variances will remain essentially parabolic. It is not expected that the determination of a good approximation to the regression relations will serve to give an adequate notion of the probability relations of the means and variances, means squared and variances of samples from a population represented by (1), because the arrays may vary in number of modes, in skewness, in dispersion, and in other characteristics. For instance, surface (7) may be trimodal so that arrays may be bimodal or unimodal, and in such a case the arrays must vary markedly. Surfaces (7) and (10) with 2 replaced by $n$ and with the terms suitably weighted are valuable approximations to the probability relations of the means and variances, means squared and variances of samples drawn from a population represented by (1).



A TABLE TO FACILITATE THE FITTING OF 
CERTAIN LOGISTIC CURVES 


By 

Joshua L. Bailey, Jr.


The most useful generalization of the logistic curve is that having the form

(1)  $y = \dfrac{k}{1 + e^{a + bx + cx^2 + gx^3}}$.

In practice it will seldom be found necessary to use higher powers of $x$. This equation may also be written

(2)  $Y = a + bx + cx^2 + gx^3$

in which $Y \equiv \log \dfrac{k - y}{y}$.

If we can evaluate the constant $k$ with reasonable accuracy, the value of $Y$ corresponding to each observed value of $y$ can be computed, and then the values of the coefficients $a$, $b$, $c$, and $g$ in equation (1) may be obtained by fitting equation (2) as a generalized parabola by the method of least squares.
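The change of variable can be sketched in a few lines. The block assumes the logistic takes the form y = k / (1 + exp(a + bx + cx² + gx³)), so that Y = log((k - y)/y) is exactly the cubic of equation (2); the function names and the numerical constants are illustrative only.

```python
from math import exp, log


def logistic(x, k, a, b, c, g):
    """Generalized logistic y = k / (1 + exp(a + b x + c x^2 + g x^3)) (assumed form)."""
    return k / (1.0 + exp(a + b * x + c * x ** 2 + g * x ** 3))


def to_linear(y, k):
    """Transform an observation y into Y = log((k - y) / y); requires 0 < y < k."""
    if not 0.0 < y < k:
        raise ValueError("y must lie strictly between 0 and k")
    return log((k - y) / y)


# Round trip: the transformed ordinates reproduce the cubic in the exponent.
K, A, B, C, G = 100.0, 0.5, -0.8, 0.02, 0.01
for x in range(-3, 4):
    y = logistic(x, K, A, B, C, G)
    print(x, round(to_linear(y, K), 6), round(A + B * x + C * x ** 2 + G * x ** 3, 6))
```

An observation at or beyond the asymptote has no finite transform, which is why the guard raises an error; in practice such a point signals that the trial value of k is too small.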

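The normal equations necessary to make this fit will be found to be

$a\,\Sigma x^0 + b\,\Sigma x + c\,\Sigma x^2 + g\,\Sigma x^3 = \Sigma Y$
$a\,\Sigma x + b\,\Sigma x^2 + c\,\Sigma x^3 + g\,\Sigma x^4 = \Sigma xY$
$a\,\Sigma x^2 + b\,\Sigma x^3 + c\,\Sigma x^4 + g\,\Sigma x^5 = \Sigma x^2Y$
$a\,\Sigma x^3 + b\,\Sigma x^4 + c\,\Sigma x^5 + g\,\Sigma x^6 = \Sigma x^3Y$.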

In the special case where the observations have been made at regular intervals (that is, where the successive values of $x$ are in arithmetic progression) the solution of these normal equations may be greatly simplified. We may then select an arbitrary origin in the middle of the range of observations, so that for every positive value of $x$ there will be a corresponding negative value of equal absolute magnitude. Thus the sums of the odd powers of $x$ will all be zero.

If the number of observations be odd, the middle one will, of course, be chosen for the origin, and the unit of the scale will be the interval between successive values of $x$. If the number of observations be even, the origin will be midway between the middle pair of observations, and it will be found more convenient to take half the interval as scale unit. In the former case, $x$ will take all integral values between $+n$ and $-n$, while in the latter case $x$ may take only the odd integral values.

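If we set the sums of the odd powers of $x$ in the normal equations equal to zero, and solve them simultaneously, we derive the following formulae for the literal coefficients:

$A = \dfrac{\Sigma X^4\,\Sigma Y - \Sigma X^2\,\Sigma X^2Y}{\Sigma X^0\,\Sigma X^4 - (\Sigma X^2)^2}$,   $C = \dfrac{\Sigma X^0\,\Sigma X^2Y - \Sigma X^2\,\Sigma Y}{\Sigma X^0\,\Sigma X^4 - (\Sigma X^2)^2}$,

$B = \dfrac{\Sigma X^6\,\Sigma XY - \Sigma X^4\,\Sigma X^3Y}{\Sigma X^2\,\Sigma X^6 - (\Sigma X^4)^2}$,   $G = \dfrac{\Sigma X^2\,\Sigma X^3Y - \Sigma X^4\,\Sigma XY}{\Sigma X^2\,\Sigma X^6 - (\Sigma X^4)^2}$.

The use of capital letters indicates that the equation has been referred to the arbitrary origin.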
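When the odd power sums vanish, the four normal equations split into two pairs, one in A and C and one in B and G, and each pair can be solved in closed form. The sketch below writes out that solution with exact rational arithmetic; the function name is illustrative, and the formulas are those obtained by solving the two 2 by 2 systems directly.

```python
from fractions import Fraction


def fit_cubic_symmetric(xs, ys):
    """Least-squares cubic Y = A + B X + C X^2 + G X^3 for X values symmetric about zero."""
    s = {p: sum(Fraction(x) ** p for x in xs) for p in range(7)}
    if s[1] != 0 or s[3] != 0 or s[5] != 0:
        raise ValueError("odd power sums must vanish: refer X to the central origin")
    t = {p: sum(Fraction(x) ** p * Fraction(y) for x, y in zip(xs, ys)) for p in range(4)}

    even_den = s[0] * s[4] - s[2] ** 2   # denominator shared by A and C
    odd_den = s[2] * s[6] - s[4] ** 2    # denominator shared by B and G

    a = (s[4] * t[0] - s[2] * t[2]) / even_den
    c = (s[0] * t[2] - s[2] * t[0]) / even_den
    b = (s[6] * t[1] - s[4] * t[3]) / odd_den
    g = (s[2] * t[3] - s[4] * t[1]) / odd_den
    return a, b, c, g


# An exact cubic is recovered exactly, for an odd or an even number of observations.
def cubic(x):
    return 2 - 3 * x + Fraction(1, 2) * x ** 2 + Fraction(1, 4) * x ** 3


odd_xs = list(range(-3, 4))           # seven observations, unit spacing
even_xs = [-3, -1, 1, 3]              # four observations, half-interval units
print(fit_cubic_symmetric(odd_xs, [cubic(x) for x in odd_xs]))
print(fit_cubic_symmetric(even_xs, [cubic(x) for x in even_xs]))
```

Only the four sums involving Y depend on the data; the two denominators and the power sums depend on the number of observations alone, which is what makes them worth tabulating.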

In these formulae the factors involving $Y$ must be computed from the observations, but those in which $X$ alone occurs may be tabulated for all convenient values of $n$. Since $Y$ does not occur in the denominators at all, these may be tabulated in the same way.
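The tabulated quantities depend only on the set of X values, so they can be regenerated directly. The sketch below computes, for a given n, the power sums, the two denominators, and the ratio ΣX⁴/ΣX², both when X runs over all integers from -n to +n (odd number of observations) and when X runs over the odd integers from -n to +n (even number of observations). The function and key names are illustrative.

```python
def table_row(xs):
    """Power sums and derived quantities for a set of X values symmetric about zero."""
    s0 = len(xs)
    s2 = sum(x ** 2 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    s6 = sum(x ** 6 for x in xs)
    return {
        "S0": s0, "S2": s2, "S4": s4, "S6": s6,
        "S0*S4-S2^2": s0 * s4 - s2 ** 2,
        "S2*S6-S4^2": s2 * s6 - s4 ** 2,
        "S4/S2": s4 / s2,
    }


def odd_case(n):
    """Odd number of observations: X = -n, ..., -1, 0, 1, ..., n (n >= 1)."""
    return table_row(list(range(-n, n + 1)))


def even_case(n):
    """Even number of observations: X = -n, ..., -3, -1, 1, 3, ..., n with n odd."""
    if n % 2 == 0:
        raise ValueError("n must be odd when the number of observations is even")
    return table_row(list(range(-n, n + 1, 2)))


print(odd_case(25))
print(even_case(49))
```

Python integers are of arbitrary size, so the largest entries are computed exactly.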



TABLE TO BE USED WHEN THE NUMBER OF OBSERVATIONS IS ODD

[The rows for n = 0 to 20 are not legible in the source scan.]

 n  ΣX⁰     ΣX²        ΣX⁴            ΣX⁶  ΣX⁰ΣX⁴−(ΣX²)²      ΣX²ΣX⁶−(ΣX⁴)²  ΣX⁴/ΣX²
21   43   6,622  1,834,294    604,443,862     35,023,758    637,992,775,728    277.0
22   45   7,590  2,302,806    831,203,670     46,018,170  1,005,920,381,664    303.4
23   47   8,648  2,862,488  1,127,275,448     59,749,032  1,554,840,524,160    331.0
24   49   9,800  3,526,040  1,509,481,400     76,735,960  2,359,959,638,400    359.8
25   51  11,050  4,307,290  1,997,762,650     97,569,290  3,522,530,138,400    389.8
























TABLE TO BE USED WHEN THE NUMBER OF OBSERVATIONS IS EVEN 


 n  ΣX⁰     ΣX²         ΣX⁴              ΣX⁶  ΣX⁰ΣX⁴−(ΣX²)²        ΣX²ΣX⁶−(ΣX⁴)²  ΣX⁴/ΣX²
 1    2       2           2                2              0                    0      1.0
 3    4      20         164            1,460            256                2,304      8.2
 5    6      70       1,414           32,710          3,584              290,304     20.2
 7    8     168       6,216          268,008         21,504            6,386,688     37.0
 9   10     330      19,338        1,330,890         84,480           65,235,456     58.6
11   12     572      48,620        4,874,012        256,256          424,030,464     85.0
13   14     910     105,742       14,527,630        652,288        2,038,772,736    116.2
15   16   1,360     206,992       37,308,880      1,462,272        7,894,388,736    152.2
17   18   1,938     374,034       85,584,018      2,976,768       25,960,393,728    193.0
19   20   2,660     634,676      179,675,780      5,617,920       75,123,949,824    238.6
21   22   3,542   1,023,638      351,208,022      9,974,272      196,144,058,880    289.0
23   24   4,600   1,583,320      647,279,800     16,839,680      470,584,857,600    344.2
25   26   5,850   2,364,570    1,135,561,050     27,256,320    1,051,840,857,600    404.2
27   28   7,308   3,427,452    1,910,402,028     42,561,792    2,213,790,808,320    469.0
29   30   8,990   4,842,014    3,100,048,670     64,440,320    4,424,337,967,104    538.6
31   32  10,912   6,689,056    4,875,056,032     94,978,048    8,453,141,250,048    613.0
33   34  13,090   9,060,898    7,457,991,970    136,722,432   15,525,242,320,896    692.2
35   36  15,540  12,062,148   11,134,523,220    192,745,728   27,535,076,464,896    776.2
37   38  18,278  15,810,470   16,265,976,038    266,712,576   47,338,548,401,664    865.0
39   40  21,320  20,437,352   23,303,463,560    362,951,680   79,144,486,327,296    958.6
41   42  24,682  26,088,874   32,803,672,042    486,531,584  129,030,886,752,768   1057.0
43   44  28,380  32,926,476   45,446,398,140    643,340,544  205,615,957,434,624   1160.2
45   46  32,430  41,127,726   62,053,929,390    840,170,496  320,919,084,186,624   1268.2
47   48  36,848  50,887,088   83,612,360,048  1,084,805,120  491,452,517,928,960   1381.0
49   50  41,650  62,416,690  111,294,934,450  1,386,112,000  739,590,829,286,400   1498.6


Finally, the sign of G is determined by the direction in which the curve approaches the asymptote y = 0, and this may readily be told by inspection. But it not infrequently happens that a slight error in one of the observations may be sufficient to give G the wrong sign. In this case the limits between which the observations were taken must be changed, or a new value of k must be tried, or the faulty observation must be adjusted by a smoothing formula. It is obviously important therefore that some means be provided for determining the sign of G before the values of the coefficients are determined.

E)(j ZIX 

The condition that G shall be negative is ^ ^ y 
The second term in this inequality may be tabulated in the same 
way. The accompanying tables show the values of the functions 

zxf Zixt 

zx‘z:x^-fz:xYand z:x%z:x^ 

for all values of n from 0 to 25 when the number of observations
is odd and from 0 to 49 when it is even.
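The tabulated functions are easy to recompute. The sketch below assumes, as the legible entries of the even-numbered table suggest, that the observations are coded as the odd integers X = ±1, ±3, ..., ±n, so that there are N = n + 1 of them. The function name and the dictionary keys are illustrative and do not come from the paper.

```python
def logistic_table_row(n):
    """Power sums for N = n + 1 observations coded X = -n, -n + 2, ..., n - 2, n (n odd)."""
    if n % 2 == 0:
        raise ValueError("n must be odd when the number of observations is even")
    xs = range(-n, n + 1, 2)          # the odd integers from -n to n (assumed coding)
    count = len(xs)                   # N, the number of observations
    s2 = sum(x ** 2 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    s6 = sum(x ** 6 for x in xs)
    return {
        "N": count,
        "sum_x2": s2,
        "sum_x4": s4,
        "sum_x6": s6,
        "N*sum_x4 - sum_x2^2": count * s4 - s2 ** 2,
        "sum_x2*sum_x6 - sum_x4^2": s2 * s6 - s4 ** 2,
    }


for n in (1, 3, 5):
    print(n, logistic_table_row(n))
```

Exact integer arithmetic is used throughout, so the largest entries of the table can be checked without rounding error.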

In the preparation of these tables, my thanks are due to the 
Zoological Society of San Diego for the use of the facilities 
afforded by its research department. 







THE GENERALIZATION OF
STUDENT'S RATIO*


By

Harold Hotelling


The accuracy of an estimate of a normally distributed quantity is judged by reference to its variance, or rather, to an estimate of the variance based on the available sample. In 1908 “Student” examined the ratio of the mean to the standard deviation of a sample.¹ The distribution at which he arrived was obtained in a more rigorous manner in 1925 by R. A. Fisher,² who at the same time showed how to extend the application of the distribution beyond the problem of the significance of means, which had been its original object, and applied it to examine regression coefficients and other quantities obtained by least squares, testing not only the deviation of a statistic from a hypothetical value but also the difference between two statistics.

Let $\xi$ be any linear function of normally and independently distributed observations of equal variance, and let $s$ be the estimate of the standard error of $\xi$ derived by the method of maximum likelihood. If we let $t$ be the ratio to $s$ of the deviation of $\xi$ from its mathematical expectation, Fisher's result is that the probability that $t$ lies between $t_1$ and $t_2$ is

(1) $$\frac{1}{\sqrt{\pi n}}\,\frac{\Gamma\left(\frac{n+1}{2}\right)}{\Gamma\left(\frac{n}{2}\right)}\int_{t_1}^{t_2}\frac{dt}{\left(1+\frac{t^2}{n}\right)^{\frac{n+1}{2}}},$$

where $n$ is the number of degrees of freedom involved in the estimate $s$.

*Presented at the meeting of the American Mathematical Society at Berkeley, April 11, 1931.
¹Biometrika, vol. 6 (1908), p. 1.
²Applications of Student's Distribution, Metron, vol. 5 (1925), p. 90.
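As a numerical illustration of the probability that $t$ lies between two limits, the following sketch integrates the usual form of Student's frequency function by Simpson's rule. The function names and the choice of Simpson's rule are assumptions made for this example; they are not part of the paper.

```python
import math


def student_density(t, n):
    """Frequency function of Student's t with n degrees of freedom."""
    coef = math.gamma((n + 1) / 2) / (math.sqrt(math.pi * n) * math.gamma(n / 2))
    return coef * (1 + t * t / n) ** (-(n + 1) / 2)


def student_probability(t1, t2, n, steps=2000):
    """Probability that t lies between t1 and t2, by Simpson's rule (steps must be even)."""
    h = (t2 - t1) / steps
    total = student_density(t1, n) + student_density(t2, n)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * student_density(t1 + k * h, n)
    return total * h / 3


print(student_probability(-1.0, 1.0, 1))   # one degree of freedom
print(student_probability(-2.0, 2.0, 10))  # ten degrees of freedom
```

For one degree of freedom the distribution is the Cauchy form, which gives a convenient check on the routine.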

It is easy to see how this result may be extended to cases in which the variances of the observations are not equal but have known ratios and in which, instead of independence among the observations, we have a known system of intercorrelations. Indeed, we have only to replace the observations by a set of linear functions of them which are independently distributed with equal variance. By way of further extension beyond the cases discussed by Fisher, it may be remarked that the estimate of variance $s^2$ may be based on a body of data not involved in the calculation of $\xi$. Thus the accuracy of a physical measurement may be estimated by means of the dispersion among similar measurements on a different quantity.

A generalization of quite a different order is needed to test the simultaneous deviations of several quantities. This problem was raised by Karl Pearson in connection with the determination whether two groups of individuals do or do not belong to the same race, measurements of a number of organs or characters having been obtained for all the individuals. Several “coefficients of racial likeness” have been suggested by Pearson and by V. Romanovsky with a view to such biological uses. Romanovsky has made a careful study¹ of the sampling distributions, assuming in each case that the variates are independently and normally distributed. One of Romanovsky's most important results is the exact sampling distribution of $L$, a constant multiple of the sum of the squares of the values of $t$ for the different variates. This distribution function is given by a somewhat complex infinite series. For large samples and numerous variates it slowly approximates to the normal form; for 500 individuals, Romanovsky considers that an adequate approach to normality requires that no fewer than 62 characters be measured in each individual. When it is remembered that all these characters must be entirely independent, and that it is usually hard to find as many as three independent characters, the difficulties in application will be apparent. To avoid these troubles, Romanovsky proposes a new coefficient of racial likeness, $H$, the average of the ratios of variances in the two samples for the several characters. He obtains the exact distribution of $H$, again as an infinite series, though it approaches normality more rapidly than the distribution of $L$. But $H$ does not satisfy the need for a comparison between magnitudes of characters, since it concerns only their variabilities.

¹V. Romanovsky, On the criteria that two given samples belong to the same normal population (on the different coefficients of racial likeness), Metron, vol. 7 (1928), no. 3, pp. 3-46; K. Pearson, On the coefficient of racial likeness, Biometrika, vol. 18 (1926), pp. 105-118.

Joint comparisons of correlated variates, and variates of unknown correlations and standard deviations, are required not only for biologic purposes, but in a great variety of subjects. The eclipse and comparison star plates used in testing the Einstein deflection of light show deviations in right ascension and in declination; an exact calculation of probability combining the two least-square solutions is desirable. The comparison of the prices of a list of commodities at two times, with a view to discovering whether the changes are more than can reasonably be ascribed to ordinary fluctuation, is a problem dealt with only very crudely by means of index numbers, and is one of many examples of the need for such a coefficient as is now proposed. We shall generalize Student's distribution to take account of such cases.

We consider $p$ variates $x_1, x_2, \ldots, x_p$, each of which is measured for $N$ individuals, and denote by $X_{i\alpha}$ the value of $x_i$ for the $\alpha$th individual. Taking first the problem of the significance of the deviations from a hypothetical set of mean values $m_1, m_2, \ldots, m_p$, we calculate the means $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_p$ of the samples, and put

$$\xi_i = (\bar{x}_i - m_i)\sqrt{N}.$$

Then the mean values of the $\xi_i$ will all be zero, and the variances and covariances will be the same as for the corresponding $x_i$, since the individuals are supposed chosen independently from an infinite population.¹ In order to estimate them with the help of the deviations

$$x_{i\alpha} = X_{i\alpha} - \bar{x}_i$$

from the respective means, we call $n = N - 1$ the number of degrees of freedom and take as the estimates of the variances and covariances,

(2) $$a_{ij} = a_{ji} = \frac{1}{n}\sum_{\alpha=1}^{N} x_{i\alpha}\,x_{j\alpha}.$$

We next put:

$$a = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1p} \\ a_{21} & a_{22} & \cdots & a_{2p} \\ \vdots & \vdots & & \vdots \\ a_{p1} & a_{p2} & \cdots & a_{pp} \end{vmatrix},$$

(3) $$A_{ij} = A_{ji} = \frac{\text{cofactor of } a_{ij} \text{ in } a}{a}.$$

¹“Mean value” is used in the sense of mathematical expectation; the variance of a quantity whose mean value is zero is defined as the expectation of its square; the covariance of two such quantities is the expectation of their product. Thus the correlation of the two in a hypothetical infinite population is the ratio of their covariance to the geometric mean of the variances.


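The measure of simultaneous deviations which we shall employ is

(4) $$T^2 = \sum_{i=1}^{p}\sum_{j=1}^{p} A_{ij}\,\xi_i\,\xi_j.$$

For a single variate it is natural to take $A_{11} = 1/a_{11}$; then $T$ reduces to $t$, the ordinary “critical ratio” of a deviation in a mean to its estimated standard error, a ratio which has “Student's distribution,” (1). For examining the deviations from zero of two variates $x$ and $y$,

$$T^2 = \frac{N}{1-r^2}\left(\frac{\bar{x}^2}{s_1^2} - \frac{2r\,\bar{x}\,\bar{y}}{s_1 s_2} + \frac{\bar{y}^2}{s_2^2}\right),$$

where

$$s_1^2 = \frac{\sum (X-\bar{x})^2}{N-1}, \qquad s_2^2 = \frac{\sum (Y-\bar{y})^2}{N-1}, \qquad r = \frac{\sum (X-\bar{x})(Y-\bar{y})}{\sqrt{\sum (X-\bar{x})^2 \sum (Y-\bar{y})^2}}.$$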
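The measure of simultaneous deviations can be computed directly from its definition as a quadratic form in the deviations of the means, using the inverse of the matrix of estimated variances and covariances. The sketch below does this by solving a linear system, and compares the result with the explicit two-variate expression in terms of $r$, $s_1$ and $s_2$. The data are made up for illustration, and the function names are assumptions of this example rather than anything in the paper.

```python
import math


def estimates(rows):
    """Means and the estimated variances and covariances, with n = N - 1 degrees of freedom."""
    N, p = len(rows), len(rows[0])
    means = [sum(row[i] for row in rows) / N for i in range(p)]
    a = [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in rows) / (N - 1)
          for j in range(p)] for i in range(p)]
    return means, a


def solve(a, b):
    """Solve the linear system a y = b by Gaussian elimination with partial pivoting."""
    p = len(b)
    m = [list(a[i]) + [b[i]] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(col + 1, p):
            factor = m[k][col] / m[col][col]
            for c in range(col, p + 1):
                m[k][c] -= factor * m[col][c]
    y = [0.0] * p
    for k in range(p - 1, -1, -1):
        y[k] = (m[k][p] - sum(m[k][c] * y[c] for c in range(k + 1, p))) / m[k][k]
    return y


def t_squared(rows, hypothetical_means):
    """T^2 as the quadratic form in xi_i = (mean_i - m_i) * sqrt(N)."""
    N = len(rows)
    means, a = estimates(rows)
    xi = [(mean - m) * math.sqrt(N) for mean, m in zip(means, hypothetical_means)]
    # solve(a, xi) gives the product of the inverse matrix (the A_ij) with xi
    return sum(x * y for x, y in zip(xi, solve(a, xi)))


def t_squared_two_variates(rows):
    """Explicit two-variate expression for deviations from zero."""
    N = len(rows)
    xbar = sum(row[0] for row in rows) / N
    ybar = sum(row[1] for row in rows) / N
    sxx = sum((row[0] - xbar) ** 2 for row in rows)
    syy = sum((row[1] - ybar) ** 2 for row in rows)
    sxy = sum((row[0] - xbar) * (row[1] - ybar) for row in rows)
    s1, s2 = math.sqrt(sxx / (N - 1)), math.sqrt(syy / (N - 1))
    corr = sxy / math.sqrt(sxx * syy)
    return N / (1 - corr ** 2) * (
        xbar ** 2 / s1 ** 2 - 2 * corr * xbar * ybar / (s1 * s2) + ybar ** 2 / s2 ** 2)


data = [(1.2, 0.7), (0.4, -0.3), (2.1, 1.5), (-0.5, 0.2), (1.0, 1.1)]  # made-up measurements
print(t_squared(data, [0.0, 0.0]), t_squared_two_variates(data))
```

Solving the linear system avoids forming the cofactors explicitly; the two printed values should agree to within rounding.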


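For comparing the means of two samples, one of $N_1$ and the other of $N_2$ individuals, we distinguish symbols pertaining to the second sample by primes, and write

(5) $$\xi_i = \frac{\bar{x}_i - \bar{x}'_i}{\sqrt{\dfrac{1}{N_1} + \dfrac{1}{N_2}}}, \qquad n = N_1 + N_2 - 2,$$

(6) $$a_{ij} = \frac{1}{n}\left(\sum_{\alpha=1}^{N_1} x_{i\alpha}\,x_{j\alpha} + \sum_{\alpha=1}^{N_2} x'_{i\alpha}\,x'_{j\alpha}\right),$$

and take as our “coefficient of racial likeness” the value (4) of $T^2$, in which the $\xi_i$ are calculated from (5) and the $A_{ij}$ from (6) and (3).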

Other situations to which the measure $T$ of simultaneous deviations can be applied include comparisons of regression coefficients and slopes of lines of secular trend, comparisons which for single variates have been explained by R. A. Fisher.¹ In each case we deal for each variate with a linear function $\xi_i$ of the observed values, such that the sum of the squares of the coefficients is unity, so that the variance is the same as for a single observation, and such that the expectation of $\xi_i$ is, on the hypothesis to be tested, zero. Deviations $x_{i\alpha}$ of the observations from means, or from trend lines or other such estimates, are used to provide the estimated variances and covariances $a_{ij}$ by (2). The number of degrees of freedom $n$ is the difference between the number $N$ of individuals and the number $q$ of independent linear relations which must be satisfied by the quantities $x_{i1}, x_{i2}, \ldots, x_{iN}$ on account of their method of derivation. For all the variates, these relations and $n$ must be the same.

¹Metron, loc. cit., and Statistical Methods for Research Workers, Oliver and Boyd, third edition (1928).

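The general procedure is to set up what may be called normal values $\bar{X}_{i\alpha}$ for the respective $X_{i\alpha}$, putting

(7) $$x_{i\alpha} = X_{i\alpha} - \bar{X}_{i\alpha}.$$

The underlying assumption is that $X_{i\alpha}$ is composed of two parts, of which one, $\epsilon_{i\alpha}$, is normally and independently distributed about zero with variance $\sigma_i^2$ which is the same for all the observations on $x_i$. The other component is determined by the time, place, or other circumstances of the $\alpha$th observation in some regular manner, the same for all the variates. Denoting this part by $\eta_{i\alpha}$ we have

$$X_{i\alpha} = \eta_{i\alpha} + \epsilon_{i\alpha};$$

specifically, we take $\eta_{i\alpha}$ to be a linear function, with known coefficients $g_{\alpha s}$, of $q$ unknown parameters $\zeta_{i1}, \zeta_{i2}, \ldots, \zeta_{iq}$, where $q < N$:

(8) $$\eta_{i\alpha} = \sum_{s=1}^{q} g_{\alpha s}\,\zeta_{is}.$$

Thus in dealing with a secular trend representable by a polynomial in the time, we may take the $g_{\alpha s}$ as powers of the time-variable, the $\zeta$'s as the coefficients. For differences of means, the $g$'s are 0's and 1's, and the $\zeta$'s the true means.

We estimate the $\zeta$'s by minimizing

(9) $$2V_i = \sum_{\alpha=1}^{N} (X_{i\alpha} - \eta_{i\alpha})^2.$$

Substituting from (8), differentiating with respect to $\zeta_{is}$, and replacing $\eta_{i\alpha}$ by $\bar{X}_{i\alpha}$ for the minimizing value, we obtain:

(10) $$\sum_{\alpha=1}^{N} g_{\alpha s}\,(X_{i\alpha} - \bar{X}_{i\alpha}) = 0,$$

or by (7),

(11) $$\sum_{\alpha=1}^{N} g_{\alpha s}\,x_{i\alpha} = 0 \qquad (s = 1, 2, \ldots, q).$$

Denoting also the minimizing values of $\zeta_{is}$ by $\bar{\zeta}_{is}$, we have from (8),

$$\bar{X}_{i\alpha} = \sum_{s=1}^{q} g_{\alpha s}\,\bar{\zeta}_{is}.$$

Subtracting (8),

(12) $$\bar{X}_{i\alpha} - \eta_{i\alpha} = \sum_{s=1}^{q} g_{\alpha s}\,(\bar{\zeta}_{is} - \zeta_{is}).$$

From (9),

$$2V_i = \sum_{\alpha=1}^{N}\left[(X_{i\alpha}-\bar{X}_{i\alpha}) + (\bar{X}_{i\alpha}-\eta_{i\alpha})\right]^2 = \sum_{\alpha=1}^{N}(X_{i\alpha}-\bar{X}_{i\alpha})^2 + 2\sum_{\alpha=1}^{N}(X_{i\alpha}-\bar{X}_{i\alpha})(\bar{X}_{i\alpha}-\eta_{i\alpha}) + \sum_{\alpha=1}^{N}(\bar{X}_{i\alpha}-\eta_{i\alpha})^2.$$

The middle term, by (12), equals

(13) $$2\sum_{\alpha=1}^{N}\sum_{s=1}^{q} g_{\alpha s}\,(\bar{\zeta}_{is}-\zeta_{is})(X_{i\alpha}-\bar{X}_{i\alpha});$$

this, by (10), is zero. Hence, by (7) and (13),

$$2V_i = \sum_{\alpha=1}^{N} x_{i\alpha}^2 + 2W_i,$$

where

$$2W_i = \sum_{\alpha=1}^{N} (\bar{X}_{i\alpha} - \eta_{i\alpha})^2.$$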


If the equations (10) be solved for the minimizing values $\bar{\zeta}_{i1}, \bar{\zeta}_{i2}, \ldots, \bar{\zeta}_{iq}$ of the parameters, the values of these quantities will be found to be homogeneous linear functions of the observations $X_{i\alpha}$. By (7), therefore, the quantities

$$x_{i1}, x_{i2}, \ldots, x_{iN}$$

are homogeneous linear functions of the $X_{i\alpha}$. But they are not linearly independent functions, since they are connected by the $q$ relations (11). Hence $\sum x_{i\alpha}^2$ is a quadratic form of rank

$$n = N - q.$$

Since $V_i$, by (9), is of rank $N$, $W_i$ is of rank $q$.

This shows that $Np$ new quantities $z_{i\alpha}$, given by equations of the form

(14) [equation illegible in this copy]

can be found such that

(15) $$\sum_{\alpha=1}^{N} x_{i\alpha}^2 = \sum_{\alpha=1}^{n} z_{i\alpha}^2, \qquad 2W_i = \sum_{\alpha=n+1}^{N} z_{i\alpha}^2,$$

and therefore

(16) $$2V_i = \sum_{\alpha=1}^{N} z_{i\alpha}^2.$$

Substituting (14) in (15) and equating like coefficients,

(17) [equation illegible in this copy]

where $\delta$ is the Kronecker delta, equal to 1 if its two indices are equal, to 0 otherwise.

The coefficients in (14) depend only on the $g_{\alpha s}$, which have been assumed to be the same for all the $p$ variates. Thus (14) may be written

[equation illegible in this copy]

Multiplying by (14), summing with respect to $\alpha$ from 1 to $n$, and using (17),

(18) $$\sum_{\alpha=1}^{n} z_{i\alpha}\,z_{j\alpha} = \sum_{\alpha=1}^{N} x_{i\alpha}\,x_{j\alpha}.$$


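Just as in (2), we define $a_{ij}$ in this generalized case by

(19) $$a_{ij} = \frac{1}{n}\sum_{\alpha=1}^{N} x_{i\alpha}\,x_{j\alpha}.$$

Then by (18),

(20) $$a_{ij} = \frac{1}{n}\sum_{\alpha=1}^{n} z_{i\alpha}\,z_{j\alpha}.$$

Of the last equation, (6) is a special case.

The random parts $\epsilon_{i\alpha}$ of the observations on $x_i$ have by hypothesis the distribution

$$\frac{1}{(\sigma_i\sqrt{2\pi})^{N}}\,e^{-V_i/\sigma_i^2}\,d\epsilon_{i1}\,d\epsilon_{i2}\cdots d\epsilon_{iN},$$

where $V_i$ is given by (9). From what has been shown, it is clear that this may be transformed into

$$\frac{1}{(\sigma_i\sqrt{2\pi})^{N}}\,e^{-\sum_{\alpha} z_{i\alpha}^2/2\sigma_i^2}\,dz_{i1}\,dz_{i2}\cdots dz_{iN},$$

showing that $z_{i1}, z_{i2}, \ldots, z_{iN}$ are normally and independently distributed with equal variance $\sigma_i^2$.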

The statistic $\xi_i$ must be independent of the quantities $z_{i1}, z_{i2}, \ldots, z_{in}$; its mean value must be zero, and its variance must be $\sigma_i^2$. These conditions are satisfied in the cases which have been mentioned, and are satisfied in general if $\xi_i$ is a linear homogeneous function of $z_{i,n+1}, \ldots, z_{iN}$ with the sum of the squares of the coefficients equal to unity.

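The measure of simultaneous discrepancy is

$$T^2 = \sum_{i=1}^{p}\sum_{j=1}^{p} A_{ij}\,\xi_i\,\xi_j,$$

$A_{ij}$ being defined by (3) on the basis of (19). It is evident that

(21) $$T^2 = -\,\frac{\begin{vmatrix} 0 & \xi_1 & \xi_2 & \cdots & \xi_p \\ \xi_1 & a_{11} & a_{12} & \cdots & a_{1p} \\ \xi_2 & a_{21} & a_{22} & \cdots & a_{2p} \\ \vdots & \vdots & \vdots & & \vdots \\ \xi_p & a_{p1} & a_{p2} & \cdots & a_{pp} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1p} \\ a_{21} & a_{22} & \cdots & a_{2p} \\ \vdots & \vdots & & \vdots \\ a_{p1} & a_{p2} & \cdots & a_{pp} \end{vmatrix}},$$

as appears when the numerator is expanded by the first row, and the resulting determinants by their first columns.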

A most important property of $T$ is that it is an absolute invariant under all homogeneous linear transformations of the variates $x_1, x_2, \ldots, x_p$. This may be seen most simply by tensor analysis; for $\xi_i$ is covariant of the first order and $A_{ij}$ is contravariant of the second order.

The invariance of $T$ shows that in seeking its sampling distribution we may, without loss of generality, assume that the variates $x_1, x_2, \ldots, x_p$ have, in the normal population, zero correlations and equal variances; for they may always by a linear transformation be replaced by such variates.

Let us now take

$$\xi_i, z_{i1}, z_{i2}, \ldots, z_{in}$$

as rectangular coordinates of a point $P_i$ in space $V_{n+1}$ of $n + 1$ dimensions. Since these quantities are normally and independently distributed with equal variance about zero, the probability density for $P_i$ has spherical symmetry about the origin. Indefinite repetition of the sampling would result in a globular cluster of representative points for each variate. Actually the sample in hand fixes the points $P_1, P_2, \ldots, P_p$, which may be regarded as taken independently.

We shall now show that $T$ is a function of the angle $\theta$ between the $\xi$-axis and the flat space $V_p$ containing the points $P_1, P_2, \ldots, P_p$ and the origin $O$. We shall denote by $A$ the point on the $\xi$-axis of coordinates $1, 0, 0, \ldots, 0$, and by $V_n$ the flat space containing the remaining axes. Since in $V_{n+1}$ one equation specifies $V_n$ and $n + 1 - p$ equations $V_p$, the intersection of $V_n$ and $V_p$ is specified by all these $n + 2 - p$ equations, and is therefore of $p - 1$ dimensions. Call it $V_{p-1}$.

If $P_1, P_2, \ldots, P_p$ be moved about in $V_p$, $\theta$ will not change, and neither will $T$, since $T$ is invariant under linear transformations, equivalent to such motions of the $P_i$. Hence $T$ always has the value which it takes if all the lines $OP_1, OP_2, \ldots, OP_p$ are perpendicular, with the last $p - 1$ of these lines lying in $V_{p-1}$. In this case the angle $AOP_1$ equals $\theta$. Applying to the coordinates of $A$ and of $P_1$ the formula for the cosine of an angle at the origin of lines to $(x_1, x_2, \ldots)$ and $(y_1, y_2, \ldots)$, namely,

(22) $$\cos\theta = \frac{\sum x_k y_k}{\sqrt{\sum x_k^2 \sum y_k^2}},$$

we obtain

$$\cos\theta = \frac{\xi_1}{\sqrt{\xi_1^2 + z_{11}^2 + z_{12}^2 + \cdots + z_{1n}^2}}.$$

Since $z_{11}^2 + z_{12}^2 + \cdots + z_{1n}^2 = n\,a_{11}$, it follows that

(23) $$n\cot^2\theta = \xi_1^2 / a_{11}.$$

The fact that $P_2, P_3, \ldots, P_p$ lie in $V_{p-1}$, and therefore in $V_n$, shows that in this case

$$\xi_2 = \xi_3 = \cdots = \xi_p = 0.$$

Because $OP_1, OP_2, \ldots, OP_p$ are mutually perpendicular, (20) and (22) show that $a_{ij} = 0$ whenever $i \neq j$. Hence, by (21) and (23),

(24) $$T = \sqrt{n}\,\cot\theta.$$

By this result the problem of the sampling distribution of $T$ is reduced to that of the angle $\theta$ between a line $OA$ in $V_{n+1}$ and the flat space $V_p$ containing $p$ other lines drawn independently through the origin. The distribution will be unaffected if we suppose $V_p$ fixed and $OA$ drawn at random, with spherical symmetry for the points $A$.¹ Let us then, abandoning the coordinates hitherto used, take new axes of rectangular coordinates $y_1, y_2, \ldots, y_{n+1}$, of which the first $p$ lie in $V_p$. A unit hypersphere about $O$ is defined in terms of the generalized latitude-longitude parameters $\phi_1, \phi_2, \ldots, \phi_n$ if we put

$$\begin{aligned}
y_1 &= \sin\phi_1 \sin\phi_2 \sin\phi_3 \cdots \sin\phi_{p-1}\cos\phi_p \\
y_2 &= \cos\phi_1 \sin\phi_2 \sin\phi_3 \cdots \sin\phi_{p-1}\cos\phi_p \\
y_3 &= \cos\phi_2 \sin\phi_3 \cdots \sin\phi_{p-1}\cos\phi_p \\
&\;\;\vdots \\
y_p &= \cos\phi_{p-1}\cos\phi_p \\
y_{p+1} &= \sin\phi_p \cos\phi_{p+1} \\
&\;\;\vdots \\
y_n &= \sin\phi_p \sin\phi_{p+1} \cdots \cos\phi_n \\
y_{n+1} &= \sin\phi_p \sin\phi_{p+1} \cdots \sin\phi_n,
\end{aligned}$$

for the sum of the squares is unity. Since

$$y_1^2 + y_2^2 + \cdots + y_p^2 = \cos^2\phi_p,$$

we have

$$\phi_p = \theta.$$

¹This geometrical interpretation of $T$ shows its affinity with the multiple correlation coefficient, whose interpretation as the cosine of an angle of a random line with a $V_p$ enabled R. A. Fisher to obtain its exact distribution (Phil. Trans., vol. 213B, 1924, p. 91; and Proc. Roy. Soc., vol. 121A, 1928, p. 654). The omitted steps in Fisher's argument may be supplied with the help of generalized polar coordinates as in the text. Other examples of the use of these coordinates in statistics have been given by the author in The Distribution of Correlation Ratios Calculated from Random Data, Proc. Nat. Acad. Sci., vol. 11 (1925), p. 657, and in The Physical State of Protoplasm, Koninklijke Akademie van Wetenschappen te Amsterdam, Verhandelingen, vol. 25 (1928), no. 5, pp. 28-31.

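The element of probability is proportional to the element of generalized area, which is given by

$$\sqrt{D}\;d\phi_1\,d\phi_2\cdots d\phi_n,$$

where $D$ is an $n$-rowed determinant in which the element in the $i$th row and $j$th column is

$$\sum_{k=1}^{n+1}\frac{\partial y_k}{\partial \phi_i}\,\frac{\partial y_k}{\partial \phi_j}.$$

For $i \neq j$, this is zero. Of the diagonal elements, the first $p - 1$ contain the factor $\cos^2\phi_p$; the $p$th is unity; and the remaining $n - p$ elements contain the factor $\sin^2\phi_p$. Since $\phi_p$ is not otherwise involved, the element of area is the product of

$$\cos^{p-1}\phi_p\,\sin^{n-p}\phi_p\,d\phi_p$$

by factors independent of $\phi_p$. The distribution function of $\theta$ is obtained by replacing $\phi_p$ by $\theta$ and integrating with respect to the other parameters. Since $\theta$ lies between 0 and $\pi/2$, we divide by the integral between these limits and obtain for the frequency element,

$$\frac{2\,\Gamma\left(\frac{n+1}{2}\right)}{\Gamma\left(\frac{p}{2}\right)\Gamma\left(\frac{n-p+1}{2}\right)}\cos^{p-1}\theta\,\sin^{n-p}\theta\,d\theta.$$

Substituting from (24) we have as the distribution of $T$:

(25) $$\frac{2\,\Gamma\left(\frac{n+1}{2}\right)}{\Gamma\left(\frac{p}{2}\right)\Gamma\left(\frac{n-p+1}{2}\right)n^{p/2}}\;\frac{T^{p-1}\,dT}{\left(1+\frac{T^2}{n}\right)^{\frac{n+1}{2}}}.$$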


For $p = 1$ this reduces to the form of Student's distribution given by Fisher and tabulated in the issue of Metron cited; however, as $T$ may be negative as well as positive in this case, Fisher omits the factor 2.

For $p = 2$ the distribution becomes

$$\frac{n-1}{n}\;\frac{T\,dT}{\left(1+\frac{T^2}{n}\right)^{\frac{n+1}{2}}}.$$

From this it is easy to calculate as the probability that a given value of $T$ will be exceeded by chance

(26) $$P = \frac{1}{\left(1+\frac{T^2}{n}\right)^{\frac{n-1}{2}}},$$

a very convenient expression.
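The closed form for two variates can be checked against a direct numerical integration. The sketch below integrates the frequency element of the angle $\theta$, proportional to $\cos^{p-1}\theta\,\sin^{n-p}\theta$, from 0 up to the angle whose cotangent is $T/\sqrt{n}$, which is the probability that the given $T$ is exceeded. Working in $\theta$ rather than in $T$, the use of Simpson's rule, and all function names are choices made for this example.

```python
import math


def theta_density(theta, n, p):
    """Frequency element of the angle theta, for theta between 0 and pi/2."""
    coef = 2 * math.gamma((n + 1) / 2) / (
        math.gamma(p / 2) * math.gamma((n - p + 1) / 2))
    return coef * math.cos(theta) ** (p - 1) * math.sin(theta) ** (n - p)


def tail_probability(T, n, p, steps=2000):
    """Probability that T is exceeded: theta below arccot(T / sqrt(n)), by Simpson's rule."""
    upper = math.atan2(math.sqrt(n), T)   # angle with cot(upper) = T / sqrt(n)
    h = upper / steps
    total = theta_density(0.0, n, p) + theta_density(upper, n, p)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * theta_density(k * h, n, p)
    return total * h / 3


def tail_probability_two_variates(T, n):
    """Closed form for p = 2."""
    return (1 + T * T / n) ** (-(n - 1) / 2)


print(tail_probability(1.5, 10, 2), tail_probability_two_variates(1.5, 10))
```

The same numerical routine serves for any $p$ not exceeding $n$, for which no equally simple closed form is available.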

The probability integral for higher values of $p$ may be calculated in various ways, the most direct being successive integration by parts, giving a series of terms analogous to (26) to which, if $p$ is odd, is added an integral which may be evaluated with the help of the tables of Student's distribution. If $p$ is large, this process is laborious; but other methods are available.

The probability integral is reduced to the incomplete beta function if we put

$$x = \left(1 + \frac{T^2}{n}\right)^{-1},$$

for then the integral of (25) from $T$ to infinity becomes

$$I_x\left(\frac{n-p+1}{2}, \frac{p}{2}\right),$$

the notation being

$$B_x(p, q) = \int_0^x x^{p-1}(1-x)^{q-1}\,dx, \qquad I_x(p, q) = \frac{B_x(p, q)}{B_1(p, q)}.$$

Many methods of calculation have been discussed by H. E. Soper¹ and by V. Romanovsky.² An extensive table of the incomplete beta function being prepared under the supervision of Professor Karl Pearson has not yet been published.

Perhaps the most generally useful method now available is

¹Tracts for Computers, no. 7 (1921).
²On certain expansions in series of polynomials of incomplete B-functions (in English), Recueil Math. de la Soc. de Moscow, vol. 33 (1926), pp. 207-229.



to make the substitution

$$z = \tfrac{1}{2}\log_e (n-p+1)T^2 - \tfrac{1}{2}\log_e np,$$

$$n_1 = p, \qquad n_2 = n - p + 1,$$

reducing (25) to a form considered by Fisher. Table VI in his book, Statistical Methods for Research Workers, gives the values of $z$ which will be exceeded by chance in 5 per cent and in 1 per cent of cases. If the value of $z$ obtained from the data is greater than that in Fisher's table, the indication is that the deviations measured are real.
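The substitution is a one-line computation. The sketch below returns $z$ together with the two numbers of degrees of freedom with which Fisher's table is entered; the function name is an assumption of this example. Since $e^{2z}$ is the corresponding variance ratio, $(n-p+1)T^2/(np)$, that quantity can be used as a check.

```python
import math


def fisher_z(T, n, p):
    """Return (z, n1, n2) for a positive value of T, with n1 = p and n2 = n - p + 1."""
    if T <= 0:
        raise ValueError("T must be positive for the logarithm to exist")
    z = 0.5 * math.log((n - p + 1) * T * T) - 0.5 * math.log(n * p)
    return z, p, n - p + 1


z, n1, n2 = fisher_z(3.0, 20, 2)
print(z, n1, n2, math.exp(2 * z))
```

The value of $z$ so obtained would then be compared with the tabulated 5 per cent or 1 per cent point for $n_1$ and $n_2$.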

If the variances and covariances are known a priori, they are to be used instead of the $a_{ij}$; the resulting expression $T$ has the well known distribution of $\chi$, with $p$ degrees of freedom. For very large samples the estimates of the covariances from the sample are sufficiently accurate to permit the use of the $\chi$ distribution for $T$. This is well shown by (25), in which, as $n$ increases, the factor involving $T$ approaches

$$T^{p-1}\,e^{-T^2/2}\,dT,$$

which is proportional to the frequency element for $\chi$ when $\chi$ is put for $T$.

As Pearson pointed out, the labor of calculating $\chi$, which we replace by $T$, is prohibitive when forty or fifty characters are measured on each individual. With two, three, or four characters, however, the labor is very moderate, and the results far more accurate than any attainable with the Pearson coefficient. The great advantage of using $T$ is the simplicity of its distribution, with its complete independence of any correlations among the variates which may exist in the population.

To means of a single variate it is customary to attach a “probable error,” with the assumption that the difference between the true and calculated values is almost certainly less than a certain multiple of the probable error. A more precise way to follow out this assumption would be to adopt some definite level of probability, say $P = .05$, of a greater discrepancy, and to determine from a table of Student's distribution the corresponding value of $t$, which will depend on $n$; adding and subtracting the product of this value of $t$ by the estimated standard error would give upper and lower limits between which the true values may with the given degree of confidence be said to lie. With $T$ an exactly analogous procedure may be followed, resulting in the determination of an ellipse or ellipsoid centered at the point $\xi_1, \xi_2, \ldots, \xi_p$. Confidence corresponding to the adopted probability $P$ may then be placed in the proposition that the set of true values is represented by a point within this boundary.
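For two variates the boundary can be found without tables, because the probability $P = (1 + T^2/n)^{-(n-1)/2}$ of exceeding a given $T$ can be inverted for $T^2$. The sketch below, for a single sample of pairs, tests whether a proposed pair of true means lies within the resulting ellipse about the sample means. The data, the function names, and the decision to treat only the single-sample case with $n = N - 1$ are assumptions of this example.

```python
import math


def critical_t_squared(n, P):
    """Value of T^2 exceeded by chance with probability P when p = 2."""
    return n * (P ** (-2 / (n - 1)) - 1)


def inside_confidence_ellipse(xs, ys, point, P=0.05):
    """True if the proposed true means `point` give T^2 no larger than the critical value."""
    N = len(xs)
    n = N - 1
    mx, my = sum(xs) / N, sum(ys) / N
    a11 = sum((x - mx) ** 2 for x in xs) / n
    a22 = sum((y - my) ** 2 for y in ys) / n
    a12 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    det = a11 * a22 - a12 ** 2
    xi1 = (mx - point[0]) * math.sqrt(N)
    xi2 = (my - point[1]) * math.sqrt(N)
    # quadratic form with the inverse of the 2 x 2 matrix of estimates
    t2 = (a22 * xi1 ** 2 - 2 * a12 * xi1 * xi2 + a11 * xi2 ** 2) / det
    return t2 <= critical_t_squared(n, P)


xs = [1.2, 0.4, 2.1, -0.5, 1.0, 0.8]   # made-up measurements
ys = [0.7, -0.3, 1.5, 0.2, 1.1, 0.4]
print(inside_confidence_ellipse(xs, ys, (0.8, 0.6)))
print(inside_confidence_ellipse(xs, ys, (10.0, -10.0)))
```

For more than two variates the critical value would have to come from the incomplete beta function or from Fisher's table of $z$ instead.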




SYSTEMS OF POLYNOMIALS CONNECTED 
WITH THE CHARLIER EXPANSIONS AND 
THE PEARSON DIFFERENTIAL AND 
DIFFERENCE EQUATIONS* 


By 

Emanuel Henry Hildebrandt 


INTRODUCTION 

The problem of fitting mathematical curves to statistical data has commanded the attention of statisticians and mathematicians for many years. The curves referred to the most by English-speaking biometricians and mathematicians are perhaps those developed by Pearson from 1895-1916.¹ He showed that a series of curves could be obtained by assigning various values to the parameters in a certain first order differential equation. A few years later, Charlier,² attacking the same question from a different angle, showed that any function could probably be approximated by using a certain function and its derivatives in the terms of the series:

$$F(x) = A_0 f(x) + A_1 f'(x) + A_2 f''(x) + \cdots + A_n f^{(n)}(x) + \cdots$$

where the $A_i$ are constants.

*A dissertation submitted in partial fulfillment of the requirements for the Degree of Doctor of Philosophy in the University of Michigan, August, 1931.

¹Karl Pearson, “Mathematical Contributions to the Theory of Evolution,” Philosophical Transactions, A, Vol. 186 (1895), pp. 343-414; also “Supplement to a Memoir on Skew Variation,” Phil. Trans., Vol. 197 (1901), pp. 443-456; also “Second Supplement to a Memoir on Skew Variation,” Phil. Trans., A, Vol. 216 (1916), pp. 429-457.

²C. V. L. Charlier, “Ueber das Fehlergesetz,” Arkiv för Matematik, Astronomi och Fysik, Vol. 2, No. 8 (1905), pp. 1-9; also “Ueber die Darstellung willkuerlicher Funktionen,” Arkiv för Matematik, Astronomi och Fysik, Vol. 2, No. 20 (1905), pp. 1-35.

Charlier found that the constants could be formally determined, the $n$th constant being dependent on the moments of $F(x)$ of order not greater than $n$. He illustrated the method of procedure for the case where $f(x)$ was the equation of the normal curve of error, i. e. one of the Pearson curves. In fact, the successive derivatives of this particular function gave rise to a well known system of polynomials, namely the Hermite polynomials, and the coefficients are dependent upon these polynomials also.

In recent years, Romanovsky¹ has succeeded in obtaining similar results for the case in which some of the other Pearson curves are used as the $f(x)$ in the Gram-Charlier series. The successive derivatives of these other special Pearson type curve functions also result in systems of polynomials which bear fundamental relations to each other.

It is the object of this investigation to show: 

(1) That the constants obtained by Charlier for his Type 
A series can be much more readily obtained by making use of 
certain existing biorthogonality conditions; 

(2) That if the Type A series be generalized to the form; 

F(x)^QQ(x)fJx)4C,ii. QMft(x)4-C^ ^ Q(x)f^(xM- 


¹V. Romanovsky, “Generalization of some types of the frequency curves
of Professor Pearson,” Biometrika, Vol. 16 (1924), pp. 106-117; also 
“Sur quelques classes nouvelles de Polynomes orthogonaux,” Comptes 
Rendus de L’Academie des Sciences, Vol. 188 (1929), pp. 1023-1(GS. 





where $Q_n(x)$ is a polynomial of degree $n$ in $x$, then the $C_n$
can also be formally determined and depend upon the moments
of $F(x)$ of order at most $n$;

(3) That the form of the polynomials obtained by Charlier 
and Romanovsky for certain solutions of the Pearson differential 
equation can be found for any solution of this equation and that 
the relations existing between polynomials of the same system 
can also be generalized for the general solution and for the most 
part obtained without having the explicit form of the solution; 

(4) That results analogous to those obtained in (1) and (3) can be derived for the Charlier Type B series and the analogue of Pearson's differential equation, finite differences replacing the derivative.

The writer wishes particularly to express his appreciation to Prof. H. C. Carver for the valuable aid he has given both in the stimulating instruction characterized by frankness in indicating unsolved problems in his classes and through direct suggestions in the preparation of this paper.





CHAPTER I 

Polynomials Connected with the Gram-Charlier Series 

1. In the articles entitled “Ueber das Fehlergesetz” and “Ueber die Darstellung willkuerlicher Funktionen”¹ Charlier proves the following well known theorem:

Charlier's theorem for series of type A. If $F(x)$ is any real valued function of $x$, which has finite moments of all orders, then $F(x)$ may be formally expressed in terms of another function $f(x)$ and its derivatives as follows:

(A) $$F(x) = A_0 f(x) + A_1 f'(x) + A_2 f''(x) + \cdots$$

where $f(x)$ has the following properties:

(a) $f(x)$ and its derivatives are continuous for all real values of $x$,

(b) $f(x)$ and its derivatives vanish for $x = +\infty$ and $-\infty$,

(c) $\lim_{x \to \pm\infty} x^m f^{(n)}(x) = 0$ for all $m$ and $n$,

(d) $\int_{-\infty}^{\infty} f(x)\,dx \neq 0$.

The conditions (c) and (d) are not given in Charlier's articles, but an examination of the proof shows that he assumes
implicitly that they are satisfied. f(x)’ satisfies 

(a) and (b) without satisfying (c) and (d). 

In the first section of the latter paper, Charlier determines the constants $A_0, A_1, \ldots, A_n, \ldots$ He takes the series (A), multiplies it successively by $1, x, x^2, \ldots$ and integrates each result between the limits $-\infty$ to $+\infty$. The following equations result:

$$\int_{-\infty}^{\infty} F(x)\,dx = A_0\int_{-\infty}^{\infty} f(x)\,dx$$

$$\int_{-\infty}^{\infty} x\,F(x)\,dx = A_0\int_{-\infty}^{\infty} x\,f(x)\,dx + A_1\int_{-\infty}^{\infty} x\,f'(x)\,dx$$

$$\int_{-\infty}^{\infty} x^2 F(x)\,dx = A_0\int_{-\infty}^{\infty} x^2 f(x)\,dx + A_1\int_{-\infty}^{\infty} x^2 f'(x)\,dx + A_2\int_{-\infty}^{\infty} x^2 f''(x)\,dx$$

Each of these equations contains a finite number of terms and the constants $A_0, A_1, A_2, \ldots$ may readily be determined by solving them. In fact we find that any constant $A_n$ may be expressed as

$$A_n = \int_{-\infty}^{\infty} P_n(x)\,F(x)\,dx$$

where $P_n(x)$ is a polynomial in $x$ of degree not greater than $n$. An analysis of the underlying facts reveals that what Charlier has actually done is to show that under the conditions listed in the theorem there exists a uniquely determined set of polynomials $P_0(x), P_1(x), \ldots, P_n(x), \ldots$, $P_n(x)$ at most of degree $n$, biorthogonal to the set of derivatives of $f(x)$, i. e. satisfying the biorthogonality conditions:

$$\int_{-\infty}^{\infty} P_m(x)\,f^{(n)}(x)\,dx = \begin{cases} 0 & \text{for } m \neq n \\ 1 & \text{for } m = n \end{cases}$$

Further a study of the coefficients of these polynomials shows that

$$\frac{dP_n(x)}{dx} = -P_{n-1}(x),$$

i. e. we have the following theorem:

¹C. V. L. Charlier, loc. cit.



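Theorem: If $f(x)$ satisfy the conditions (a), (b), (c), and (d) of Charlier's theorem for series (A) and if $P_0(x), P_1(x), \ldots, P_n(x), \ldots$ is the system of polynomials in $x$, $P_n(x)$ of degree at most $n$, which is biorthogonal to $f(x)$ and its derivatives, i. e. satisfies the conditions

$$\int_{-\infty}^{\infty} P_m(x)\,f^{(n)}(x)\,dx = \begin{cases} 0 & \text{for } m \neq n \\ 1 & \text{for } m = n \end{cases}$$

then

$$\frac{dP_n(x)}{dx} = -P_{n-1}(x).$$

This can readily be shown to be true directly from a use of the biorthogonal property. For integrating by parts we obtain:

$$\int_{-\infty}^{\infty} P_m(x)\,f^{(n)}(x)\,dx = \Big[P_m(x)\,f^{(n-1)}(x)\Big]_{-\infty}^{\infty} - \int_{-\infty}^{\infty} P_m'(x)\,f^{(n-1)}(x)\,dx.$$

The first half of the right hand side of this equation vanishes due to condition (c) of Charlier's theorem for series (A). For the second half we have

$$-\int_{-\infty}^{\infty} P_m'(x)\,f^{(n-1)}(x)\,dx = \begin{cases} 0 & \text{for } m \neq n \\ 1 & \text{for } m = n \end{cases}$$

But we know that

$$\int_{-\infty}^{\infty} P_{m-1}(x)\,f^{(n-1)}(x)\,dx = \begin{cases} 0 & \text{for } m \neq n \\ 1 & \text{for } m = n \end{cases}$$

determines uniquely the polynomials $P_{m-1}(x)$. It follows that

$$dP_n(x)/dx = -P_{n-1}(x).$$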



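A corollary to this last theorem may be stated as follows:

Corollary: If

$$\int_{-\infty}^{\infty} P_m(x)\,f^{(n)}(x)\,dx = \begin{cases} 0 & \text{for } m \neq n \\ a_n & \text{for } m = n \end{cases}$$

($n = 0, 1, 2, \ldots$), then

$$\frac{dP_n(x)}{dx} = -\frac{a_n}{a_{n-1}}\,P_{n-1}(x).$$

The proof is similar to the one just given. Integration by parts gives the following result:

$$-\int_{-\infty}^{\infty} P_m'(x)\,f^{(n-1)}(x)\,dx = \begin{cases} 0 & \text{for } m \neq n \\ a_n & \text{for } m = n \end{cases}$$

But we know that

$$\int_{-\infty}^{\infty} P_{m-1}(x)\,f^{(n-1)}(x)\,dx = \begin{cases} 0 & \text{for } m \neq n \\ a_{n-1} & \text{for } m = n \end{cases}$$

Therefore we may conclude that

$$-\frac{1}{a_n}\,\frac{dP_n(x)}{dx} = \frac{1}{a_{n-1}}\,P_{n-1}(x)$$

or

$$\frac{dP_n(x)}{dx} = -\frac{a_n}{a_{n-1}}\,P_{n-1}(x).$$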

An illustration of this corollary is the case of the well known
Hermite polynomials which are involved in Charlier's first paper.¹
These satisfy the conditions

1C. V. L. Charlier, loc. dt Charlier uses as f ^ 

this paper we shall use the simpler basic function 






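$$\int_{-\infty}^{\infty} H_m(x)\,H_n(x)\,e^{-x^2}\,dx = \begin{cases} 0 & \text{for } m \neq n \\ 2^n\,n!\,\sqrt{\pi} & \text{for } m = n \end{cases}$$

and

$$H_n(x) = (-1)^n\,e^{x^2}\,d^n(e^{-x^2})/dx^n.$$

Hence

$$\int_{-\infty}^{\infty} H_m(x)\,\frac{d^n(e^{-x^2})}{dx^n}\,dx = \begin{cases} 0 & \text{for } m \neq n \\ (-2)^n\,n!\,\sqrt{\pi} & \text{for } m = n \end{cases}$$

If then $f(x) = e^{-x^2}$ and $P_n(x) = H_n(x)$ our corollary applies, i. e. we have

$$dH_n(x)/dx = 2n\,H_{n-1}(x).$$

We might further observe that if $a_n = (-1)^n\,n!$ then the polynomials $P_n(x)$ form a system of Appell polynomials¹ satisfying the relation

$$dP_n(x)/dx = n\,P_{n-1}(x),$$

the $n$th polynomial being the coefficient of $h^n/n!$ in the expansion of $a(h)\,e^{hx}$ where

$$a(h) = \alpha_0 + \frac{h}{1!}\,\alpha_1 + \frac{h^2}{2!}\,\alpha_2 + \cdots + \frac{h^n}{n!}\,\alpha_n + \cdots.$$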

The fact that differentiation of the 77 th poI_ nomial results, 
in the negative of the ( 77-l)th pol^momial^shows that thci?? h 
polynomial may be obtained by integrating the ( 77-1) th c.*e, 

iM. P. Appdl, “Star une dasse de Polynomes,” Annales Sdentifiques dc 
UEcole Nonnale Superioure, Vol. IX, series 2 (1880), pp. 119-120. 




B. H. HILDEBRANDT 


387 


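which will consequently determine all of the terms of the $n$th
polynomial except the constant. This constant may be found
from any of the conditions of biorthogonality. The simplest of
these conditions is:

$$\int_{-\infty}^{+\infty}P_n(x)\,f(x)\,dx = 0.$$

Setting

$$P_n(x) = -\int_0^x P_{n-1}(x)\,dx + c$$

gives

$$-\int_{-\infty}^{+\infty}\left[\int_0^x P_{n-1}(x)\,dx\right]f(x)\,dx + c\int_{-\infty}^{+\infty}f(x)\,dx = 0,
\qquad
c = \frac{\int_{-\infty}^{+\infty}\left[\int_0^x P_{n-1}(x)\,dx\right]f(x)\,dx}{\int_{-\infty}^{+\infty}f(x)\,dx},$$

so that

$$P_n(x) = -\int_0^x P_{n-1}(x)\,dx + \frac{\int_{-\infty}^{+\infty}\left[\int_0^x P_{n-1}(x)\,dx\right]f(x)\,dx}{\int_{-\infty}^{+\infty}f(x)\,dx}.$$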


This gives a very simple and elegant method of writing down
successively the polynomials associated with any function $f(x)$
satisfying the conditions of the theorem.
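The successive construction described above can be sketched in a few lines. This is an illustration under stated assumptions, not the author's computation: the weight is taken to be $f(x) = e^{-x^2}/\sqrt{\pi}$ (normalized so that its moments are rational), and the helper names `gaussian_moment`, `integrate_from_zero` and `first_polynomials` are hypothetical. Each step integrates the preceding polynomial from 0 to $x$, changes the sign, and fixes the constant from the condition $\int P_n(x) f(x)\,dx = 0$.

```python
from fractions import Fraction


def gaussian_moment(k):
    """k-th moment of the assumed weight f(x) = exp(-x**2)/sqrt(pi)."""
    if k % 2:
        return Fraction(0)
    value = Fraction(1)
    for j in range(1, k, 2):      # product (1/2)(3/2)...((k-1)/2)
        value *= Fraction(j, 2)
    return value


def integrate_from_zero(p):
    """Integral from 0 to x of a polynomial (coefficients, lowest power first)."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(p)]


def weighted_integral(p, moment):
    """Integral of p(x) f(x) over the real line, using the moments of f."""
    return sum(c * moment(k) for k, c in enumerate(p))


def first_polynomials(count, moment):
    """P_0 = 1 / (integral of f); then P_n = -(integral of P_{n-1}) + c."""
    polys = [[Fraction(1) / moment(0)]]
    while len(polys) < count:
        q = integrate_from_zero(polys[-1])
        c = weighted_integral(q, moment) / moment(0)
        nxt = [-a for a in q]
        nxt[0] += c
        polys.append(nxt)
    return polys


polys = first_polynomials(4, gaussian_moment)
# For this weight, P_2 = x^2/2 - 1/4 and P_3 = -x^3/6 + x/4.
assert polys[2] == [Fraction(-1, 4), 0, Fraction(1, 2)]
assert polys[3] == [0, Fraction(1, 4), 0, Fraction(-1, 6)]
```

Any other weight can be tried by supplying a different `moment` function, provided the moments exist.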

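Using the Charlier notation

$$\lambda_n = \frac{1}{n!}\int_{-\infty}^{+\infty}x^n f(x)\,dx$$

and observing that $P_0(x) = 1/\lambda_0$, we obtain the following
polynomials:

$$P_1(x) = -\int_0^x P_0(x)\,dx + \frac{\int\left[\int_0^x P_0(x)\,dx\right]f(x)\,dx}{\int f(x)\,dx} = -\frac{x}{\lambda_0} + \frac{\lambda_1}{\lambda_0^2},$$

$$P_2(x) = -\int_0^x P_1(x)\,dx + \frac{\int\left[\int_0^x P_1(x)\,dx\right]f(x)\,dx}{\int f(x)\,dx} = \frac{x^2}{2!\,\lambda_0} - \frac{\lambda_1 x}{\lambda_0^2} + \frac{\lambda_1^2 - \lambda_0\lambda_2}{\lambda_0^3},$$

$$P_3(x) = -\int_0^x P_2(x)\,dx + \frac{\int\left[\int_0^x P_2(x)\,dx\right]f(x)\,dx}{\int f(x)\,dx} = -\frac{x^3}{3!\,\lambda_0} + \frac{\lambda_1 x^2}{2!\,\lambda_0^2} - \frac{(\lambda_1^2 - \lambda_0\lambda_2)\,x}{\lambda_0^3} + \frac{\lambda_1^3 - 2\lambda_0\lambda_1\lambda_2 + \lambda_0^2\lambda_3}{\lambda_0^4}.$$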




2. Just as the Hermite polynomials, based as they are on
the derivatives of $e^{-x^2}$, are the starting point for expansions
of the Gram-Charlier type and for the theorem just considered,
so the Laguerre polynomials defined by $d^n[(a+bx)^n e^{-x}]/dx^n$
suggest an expansion of the type

$$F(x) = c_0 f_0(x)\varphi(x) + c_1\frac{d}{dx}\left[f_1(x)\varphi(x)\right] + \cdots + c_n\frac{d^n}{dx^n}\left[f_n(x)\varphi(x)\right] + \cdots$$

where $f_n(x)$ is a polynomial in $x$. As a matter of fact we
can state the following theorem:






Theorem: If $\varphi(x)$ is a function such that

(1) $\varphi(x)$ and all its derivatives are continuous for all
real values of $x$,

(2) $\varphi(x)$ and its derivatives are zero at $x = +\infty$ and $-\infty$,

( 4 ) is a sequence of polynomials in x such 

then there exists a unique sequence of polynomials P^p (jk:), 
{x) at most of degree m, such that 

0 for 7n/n 

“1 for /77 «/? 

If (x) is at most of degree v , then the determination of 
(x) depends at most upon the moments of <p of order n } 
The method of proof is modelled on Charlier's proof for the
preceding case. By substituting in the $n$th integration by parts
formula

iu(x)v”*Ux)dx^^y^”^- 

we have 


¹The Laguerre polynomials are not a special case of this because there the
interval of integration is $-a/b$ to $\infty$.





c£‘ „ 1 

because of conditions (2) and (3) on $\varphi(x)$. As a consequence,
if $n > m$ then $d^nP_m(x)/dx^n = 0$, so that for $n > m$

$$\int_{-\infty}^{+\infty}P_m(x)\,\frac{d^n}{dx^n}\left[f_n(x)\varphi(x)\right]dx = 0,$$

that is to say, $P_m(x)$ is orthogonal to $\frac{d^n}{dx^n}[f_n(x)\varphi(x)]$
provided $n > m$. Hence $P_n(x)$ must satisfy only the following
$n+1$ equations:


/Tar 


f„<>i;)fifx}aX’0 


rt, ^ r**‘dPfx) 

J^O‘)^fiCx)<PCx)dx = (-l)J^ —^ f , Cx)41 Cx)dx = 0 


= (- 0 1 — ^i Cx) <pCx)dx. O 


^(»Of„(x^(fi{x)dx^l 

Replacing now $P_n(x)$ by $a_0 + a_1x + \cdots + a_nx^n$
gives us the system of algebraic equations to be satisfied by
$a_0, a_1, \ldots, a_n$, viz.:







/ •hOO 

t (x)^(x)dx 

+ ag_f x‘f^Cx) 4>(x)dx + . ^oMi>(x)dx = O 

^ (fiMdX’fZaj^ f x.f^ tx) <fi(x)dx^ • - -hna^ f 

*'- 0 » •'-Oitf 

' Za., f f, (x)^{x)din----h7Hn-i)a„f}^%fx)P(x)dx=0 
(^'^)!o„. 2 jf,^^(x)4>Mdx+^-^\,jjxf^,^(x)^{x)dx _ 

(Ti-l)ta„.Jf„_j {'x>^(x}eXx-f Yi * 4./ MHx)dX‘0 

(-0 ^Tt'.a^ (x)(pCK)<dx = J 

We have here a unique determination of $a_0, a_1, \ldots, a_n$ if the determinant
of the coefficients is $\neq 0$. This is true since the determinant

... . C/ is/0 because of the 

condition (4) on (p. If ( a?) is at most of degree rv, it is 
obvious that the determination of the P^{x) resulting from 
the coefficients depends at most upon the moments of (P 
of order *77. 

The first three polynomials of the type considered in the
last theorem have the following form, the limits of integration
being $-\infty$ and $+\infty$ in each case.


p Jx^fx)dx _ 

' Jf,M^Cx)dxfyfx)dx 


X _ 

Jf^MPMdx 


1 

Jfj Cz)<pcx)dx 


J X <p fx)dx 
J(pcx)dx 



fx ^ Cx)<p(x)dx Jx ^Cx)dx _ ^ <p(x)dx 

zijf^(x)p(x)cixfppOdz 







x.Jx fi Cx) <P(x)dX _ ^ ^ 

ff^M^Ci£}c£xff^Cx)<p(x)cC)^ W<fiCx)dx 


JX fj^(z)4>Cx)dxfxf^rx)^Mdx/x^fx)dx 
Jf^ CxId^Mdx ffj, Cx}^rx)dx/f,fx)^rx)dx/<pfx)dx 

Jz% rx} 4 ^ex}dxjz<p^x)dx 
zifii fx)iptx)dj£jft rx)^rx)dxjf^fx)dx 

Jzt, C:d(p(x)dxJ:i^4^Mdx 

Zlff^ (x)(p(x)dxf ^ fx)^(k)d%j$ft)dx 


Jx^4>fx}dx _ zJxfg^tx)*pMdzfzff{x)^fx)dx 

iXx)dxf<ffje:)dz JfsMfi/xJdxJf;^ (x)iPtxJdzJfj>fx)^fx)d^ 


j, xfx^f^M^fxJdx _ ^ z^fzf^(z)4>Cz)dx _ 

^ M i^^x)dx zif(x)^^{x)dzJ 4 rx)^(x)dx 


_ 

siff^(z)<pfx)€ix 





CHAPTER II 

Polynomials Connected with Pearson's Differential Equation

1. In the work in mathematical statistics a large number
of the problems that require study involve data properly classified
into groups and about which further information is sought. This
data is often classified to form a frequency distribution. The
frequency distribution when grouped may appear to lie on a certain
curve. If it can be shown that this curve is a mathematical
curve, i. e. one for which we are able to set up an equation, then
this frequency distribution can be readily examined and studied.

There are very few frequency distributions which actually
conform to known mathematical equations. However, there are
certain curves which seem to lend themselves much better to
statistical manipulations than others. Among the most commonly
used of these are the so-called Pearson type curves. Pearson¹
showed in a series of three articles how he obtained the equations
of twelve distinct curves and this was done by considering the
differential equation

$$\frac{1}{y}\frac{dy}{dx} = \frac{a_0 + a_1x}{b_0 + b_1x + b_2x^2}$$

and solving it, after assigning particular values to the parameters
$a_0$, $a_1$, $b_0$, $b_1$ and $b_2$. The equations of these curves and
the differential equations from which they were derived are as
follows:


¹Karl Pearson, loc. cit.



[Table: the Pearson type curves, giving for each type the differential equation and the equation of the curve.]


The curves most widely used are the normal curve of error, which 
Pearson calls Type VII, and the Type III curve. 

Suppose a Pearson curve $f(x)$ has been found which
seems to fit a given distribution fairly well. The question may
well be asked: Is it possible by means of analytic methods to
approach even nearer to the given distribution? For example,
would it be possible to use this approximate function
as the $f(x)$ in the Charlier series (A) and thus obtain a closer
approximation to the observed frequency function?

Charlier in his paper "Ueber die Darstellung willkürlicher
Funktionen"¹ considered this question for the case in which $\varphi(x)$
is the normal curve of error. He showed that using this
$\varphi(x)$ reduced the series (A) to the form:


(A') $F(x) = A_0\varphi(x) + A_3\varphi'''(x) + A_4\varphi^{IV}(x) + \cdots$




the first and second derivative terms vanishing due to the proper
choice of constants. This series (A') is frequently referred to
as the Gram-Charlier Type A series. It is worthwhile to note
that this $\varphi(x)$ is the same one whose derivatives we found in
the first chapter resulted in the Hermite polynomials. These
polynomials have the following interesting properties²:

(1) $\dfrac{dH_n(x)}{dx} = 2n\,H_{n-1}(x)$

(2) $H_{n+1}(x) - 2x\,H_n(x) + 2n\,H_{n-1}(x) = 0$

(3) $H_n''(x) - 2x\,H_n'(x) + 2n\,H_n(x) = 0$

The first of these relations shows that the derivative of any Hermite
polynomial corresponds to the preceding polynomial multiplied
by $2n$. The second equation is a recurrence relation between
the $(n+1)$th, $n$th and $(n-1)$th polynomials, while the third
relation is a differential equation of the second order involving
only the $n$th polynomial.

¹C. V. L. Charlier, loc. cit.

²R. Courant and D. Hilbert, Methoden der Mathematischen Physik, I.

The use of the equations of the other Pearson type curves
as the $f(x)$ in the original Charlier series has in recent years
been studied by Romanovsky. In the first¹ of two articles, he
discusses the Pearson Type I, II and III curves as well as the
Type VII, the normal curve referred to in the last paragraph.
Just as the normal curve of error requires the use of the Hermite
polynomials, he found that the Type I curve and Type II, which
is a special case of Type I, involved the Jacobi polynomials


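$$G_n(p,q,x) = \frac{x^{1-q}(1-x)^{q-p}}{q(q+1)\cdots(q+n-1)}\,\frac{d^n}{dx^n}\left[x^{q+n-1}(1-x)^{p+n-q}\right].$$

The $n$th Jacobi polynomial satisfies the second order differential
equation²

$$x(1-x)\,G_n''(x) + \left[q - (p+1)x\right]G_n'(x) + n(p+n)\,G_n(x) = 0,$$

which corresponds to property (3) mentioned for the Hermite
polynomials above. The Type III curve involves the Laguerre
polynomials³ defined by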



and these in turn satisfy the recurrence relation 


¹V. Romanovsky, "Generalization of some types of the frequency curves
of Professor Pearson," op. cit., pp. 106-117.

²R. Courant and D. Hilbert, op. cit., Vol. I, p. 75.

³R. Courant and D. Hilbert, op. cit., pp. 77-78.





^ni -1 

and the differential equation 

L'^ (x) - 77 (x) • -V L (x) 

In the second article², Romanovsky reviews the cases of
the Type IV, V and VI curves. The generalization of the Type
IV curve gives the polynomial

where Q = arc tan x/a. These polynomials possess properties
similar to the other polynomials mentioned, viz.:

Cv+1, z)=\z(wl-Tn)x- \/^f^ (77,x) 

\ 

+Zv^-t-l-rT)\ (a^-f-x^)J^_2 (nx) 


and 

(a^-f’x‘)P^ ( 77 ,x) + \_Z(l-Tn)x - 

(rr.x)- n(n+I-2m)7^ (r7,x)‘=0 


Similarly for the Type V curve he finds the polynomials 

v/ .77 

_ , h X d / -ft+zn „ ~x) 

Pr,(h,x;-x e e > 


Also the relations 


²V. Romanovsky, "Sur quelques classes nouvelles de polynomes orthogonaux,"
loc. cit.





\j^-n*2-f^x+^f^(7>,7d-i- -n (2r7-i-Z-p)x^f^^{7i,3d 
and 

r* P^(Tr,x)-t- [^xC2-p)t sf^Pj, (n, x)-n(i?*l-p) P„(77,x)’0 
hold. 

Finally for the Type VI curve Romanovsky gets the polynomials:

Pyj (-h.q, x)- ix-a) ^x ^ \^x-a) 

and the relations: 

(77^t, z)^ \f-p^l)fx-a)^(qi-l) ai] ^ (??, z)'^x(x-a)Plj (rt, x)^ 
¥aiP^(7?,x)4\S-pUXx-cd'i‘(q^i)z\ Pj, (rj,x)-n(v4’lHi-p)J^(Tj,x)»0, 


We note, therefore, that if a solution of the Pearson differential
equation is used as the generating function $f(x)$ in
the Gram-Charlier series, a distinct set of polynomials results
in each case and that these polynomials satisfy certain recurrence
relations and differential equations. These properties
are not found in the case of functions such as $\operatorname{sech} x$ and
$\operatorname{sech}^2 x$, which were discussed as generating functions by Charlier¹
and by Roa² respectively. The successive derivatives of
$\operatorname{sech} x$ do not result in polynomials such as the Hermite or Jacobi
ones.

¹C. V. L. Charlier, "Ueber die Darstellung willkürlicher Funktionen," loc.
cit., pp. 18-22.

²Emeterio Roa, "A Number of new generating Functions with Applications
to Statistics," Doctor's Thesis, University of Michigan, 1923.

Since the generalization of the solutions of the Pearson
curves leads to distinct sets of polynomials and since these
polynomials satisfy certain fundamental relations, we are led to inquire
whether these polynomials are not special cases of a general
polynomial and may be obtained from it by specializing the coefficients
and further whether such general polynomials, if they do exist,
will satisfy certain recurrence relations and differential equations.
These problems are among those which we shall consider in this
chapter.

2. In order that we may develop the generalized polynomials,
let us consider the Pearson differential equation where the numerator
is of the first and the denominator of the second degree, i. e.

$$\frac{1}{y}\frac{dy}{dx} = \frac{a_0 + a_1x}{b_0 + b_1x + b_2x^2}.$$

For convenience we shall denote the numerator by $N$ and the
denominator by $D$. We then have the following theorem:
Theorem: If $y$ is a non-identically zero solution of

(1) $D\,\dfrac{dy}{dx} = N\,y$

then $\dfrac{D^n}{y}\dfrac{d^ny}{dx^n} = P_n(x)$ is a polynomial of degree at most $n$.

The proof will proceed by mathematical induction. It is
obvious that the theorem holds for $n = 1$, $P_1(x)$ being $N$.

Since it is true that

$$D\,\frac{dy}{dx} = N\,y$$

we obtain by differentiation

$$D\,\frac{d^2y}{dx^2} + D'\,\frac{dy}{dx} = N\,\frac{dy}{dx} + N'\,y$$

or using (1) and multiplying the equation through by $D$ we get

$$D^2\,\frac{d^2y}{dx^2} = \left(N^2 - ND' + N'D\right)y.$$

Since $D'$ is linear and $N'$ is a constant, it is obvious that
$(N^2 - ND' + N'D)$ is at most of degree 2.

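Assume then that the statement holds for $m \le n$ and we
have

(2) $D^n\,\dfrac{d^ny}{dx^n} = P_n(x)\,y.$

Differentiation gives

$$D^n\,\frac{d^{n+1}y}{dx^{n+1}} + nD^{n-1}D'\,\frac{d^ny}{dx^n} = P_n(x)\,\frac{dy}{dx} + \frac{dP_n(x)}{dx}\,y.$$

Multiplying through by $D$ we get

$$D^{n+1}\,\frac{d^{n+1}y}{dx^{n+1}} + nD'\,D^n\,\frac{d^ny}{dx^n} = P_n(x)\,D\,\frac{dy}{dx} + D\,\frac{dP_n(x)}{dx}\,y$$

and using (1) and (2), we have

$$D^{n+1}\,\frac{d^{n+1}y}{dx^{n+1}} = NP_n(x)\,y - nD'P_n(x)\,y + D\,\frac{dP_n(x)}{dx}\,y = \left[NP_n(x) - nD'P_n(x) + D\,\frac{dP_n(x)}{dx}\right]y.$$

The coefficient of $y$ is obviously a polynomial of degree at most
$n+1$. Incidentally we have derived the relation:

(I) $P_{n+1}(x) = P_n(x)\,(N - nD') + D\,\dfrac{dP_n(x)}{dx},$

an equation which gives the $(n+1)$th polynomial in terms of
the $n$th polynomial and its first derivative $P_n'(x)$.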
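Relation (I) lends itself to direct computation. The sketch below is illustrative only and assumes the form $P_1 = N$, $P_{n+1} = (N - nD')P_n + D\,dP_n/dx$ with $N$ and $D$ given as coefficient lists (lowest power first); the helper names are hypothetical. The Hermite case corresponds to $N = -2x$, $D = 1$, for which $P_n = (-1)^nH_n$.

```python
def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])


def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def deriv(p):
    return trim([k * a for k, a in enumerate(p)][1:] or [0])


def pearson_polynomials(N, D, count):
    """[P_1, ..., P_count] from P_1 = N and P_{n+1} = (N - n D') P_n + D P_n'."""
    dD = deriv(D)
    polys = [trim(N)]
    for n in range(1, count):
        factor = add(N, [-n * c for c in dD])
        polys.append(add(mul(factor, polys[-1]), mul(D, deriv(polys[-1]))))
    return polys


# Hermite case: N = -2x, D = 1 gives -2x, 4x^2 - 2, -8x^3 + 12x.
assert pearson_polynomials([0, -2], [1], 3) == [[0, -2], [-2, 0, 4], [0, 12, 0, -8]]
```

For a general quadratic $D$ the second polynomial returned agrees with $N^2 - ND' + N'D$, as found in the proof above.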

3. More generally we have:

Theorem: If $y$ is a non-identically zero solution of (1),
then

$$\frac{D^{n-k}}{y}\,\frac{d^n(D^ky)}{dx^n} = P_n(k,x)$$

is a polynomial $P_n(k,x)$; $P_n(k,x)$ is at most of degree
$n$ in $x$. In particular if $k = n$, we have that

$$\frac{1}{y}\,\frac{d^n(D^ny)}{dx^n} = P_n(n,x)$$

is a polynomial in $x$ of degree at most $n$.

This theorem can be proved directly following the lines of
the preceding theorem, but it is simpler to obtain it as an immediate
consequence of this theorem and the following lemma:

Lemma: If $y$ satisfies the differential equation (1) then
$D^ky$, where $k$ is any real number, satisfies a differential equation
of the same type, viz.:

$$\frac{d(D^ky)}{dx} = \frac{N + kD'}{D}\,D^ky.$$

Let $u = D^ky$.

Then logarithmic differentiation gives at once

$$\frac{1}{u}\frac{du}{dx} = \frac{kD'}{D} + \frac{1}{y}\frac{dy}{dx} = \frac{N + kD'}{D}.$$


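It follows from this lemma that any result which we derive
concerning the polynomials $P_n(x) = \dfrac{D^n}{y}\dfrac{d^ny}{dx^n}$, where $y$
satisfies $D\,dy/dx = Ny$, is immediately extensible to the
polynomials $P_n(k,x) = \dfrac{D^{n-k}}{y}\dfrac{d^n(D^ky)}{dx^n}$ by replacing $N$ by
$N + kD'$. In particular relation (I) becomes

(I_k) $P_{n+1}(k,x) = \left[N + (k-n)D'\right]P_n(k,x) + D\,\dfrac{dP_n(k,x)}{dx}$

which for $k = n+1$ reduces to

(I_n) $P_{n+1}(n+1,x) = (N + D')\,P_n(n+1,x) + D\,\dfrac{dP_n(n+1,x)}{dx}.$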





We single out the case $k = n$ because of the fact that this
case parallels most closely the Charlier or Hermite polynomial
case. For in this latter case the $n$th derivative of the generating
function $e^{-x^2}$ is the product of the generating function and
a polynomial of degree $n$. So in the case of any solution $y$ of
a Pearson differential equation, the $n$th derivative of $D^ny$
is the product of the generating function $y$ and a polynomial of
degree at most $n$.

By means of relation (I), we can write down the successive
polynomials $P_1(x)$, $P_2(x)$, .... The first five polynomials
may be written as follows:

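$$P_1(x) = N,$$

$$P_2(x) = (N - D')\,P_1(x) + D\,\frac{dP_1(x)}{dx} = N^2 - ND' + N'D,$$

$$P_3(x) = (N - 2D')\,P_2(x) + D\,\frac{dP_2(x)}{dx} = N^3 - 3N^2D' + 3NN'D + 2ND'^2 - 2N'DD' - NDD'',$$

$$P_4(x) = (N - 3D')\,P_3(x) + D\,\frac{dP_3(x)}{dx}$$
$$= N^4 - 6N^3D' + 6N^2N'D + 11N^2D'^2 - 14NN'DD' - 4N^2DD''$$
$$\quad - 6ND'^3 + 6N'DD'^2 + 6NDD'D'' + 3N'^2D^2 - 3N'D^2D'',$$

$$P_5(x) = (N - 4D')\,P_4(x) + D\,\frac{dP_4(x)}{dx}$$
$$= N^5 - 10N^4D' + 10N^3N'D + 35N^3D'^2 - 50N^2N'DD'$$
$$\quad - 10N^3DD'' - 50N^2D'^3 + 70NN'DD'^2 + 40N^2DD'D''$$
$$\quad + 15NN'^2D^2 - 25NN'D^2D'' + 24ND'^4 - 24N'DD'^3$$
$$\quad - 36NDD'^2D'' - 20N'^2D^2D' + 24N'D^2D'D'' + 6ND^2D''^2.$$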





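4. Following the analogy with Hermite polynomials we obtain
next a recurrence relation involving the $(n+1)$th, $n$th
and $(n-1)$th polynomials.

Starting with the original differential equation

$$D\,\frac{dy}{dx} = N\,y$$

we take the $n$th derivative of both sides, which by Leibnitz's
theorem on the derivative of a product gives us, since $D''' = 0$,

$$D\,\frac{d^{n+1}y}{dx^{n+1}} + nD'\,\frac{d^ny}{dx^n} + \frac{n(n-1)}{2!}\,D''\,\frac{d^{n-1}y}{dx^{n-1}} = N\,\frac{d^ny}{dx^n} + nN'\,\frac{d^{n-1}y}{dx^{n-1}}.$$

Multiplying this last expression by $D^n$ and collecting terms, we
get:

$$D^{n+1}\,\frac{d^{n+1}y}{dx^{n+1}} + (nD' - N)\,D^n\,\frac{d^ny}{dx^n} + n\left[\frac{n-1}{2}D'' - N'\right]D\cdot D^{n-1}\,\frac{d^{n-1}y}{dx^{n-1}} = 0.$$

Replacing now $D^n\,\dfrac{d^ny}{dx^n}$ by $P_n(x)\,y$ and dividing through
by $y$, we get the recurrence relation

(II) $P_{n+1}(x) + (nD' - N)\,P_n(x) + n\left[\dfrac{n-1}{2}D'' - N'\right]D\,P_{n-1}(x) = 0.$

We note that the coefficients of $P_{n+1}(x)$ and $P_n(x)$ are
the same as in relation (I) which we found to be

$$P_{n+1}(x) + P_n(x)\,(nD' - N) - D\,\frac{dP_n(x)}{dx} = 0.$$

Hence

(III) $\dfrac{dP_n(x)}{dx} = n\left[N' - \dfrac{n-1}{2}D''\right]P_{n-1}(x)$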





or replacing $n$ by $n+1$ we write:

$$\frac{dP_{n+1}(x)}{dx} = (n+1)\left[N' - \frac{n}{2}D''\right]P_n(x) = (n+1)(a_1 - nb_2)\,P_n(x).$$

This equation is the generalized form of the one for Hermite
polynomials, viz.:

$$\frac{dH_{n+1}(x)}{dx} = 2(n+1)\,H_n(x).$$

5. Relations (I) and (III) may now be used to obtain a
second order differential equation. Differentiating (I), we get:

$$P_{n+1}'(x) + \frac{d}{dx}\left[(nD' - N)\,P_n(x)\right] - D'P_n'(x) - DP_n''(x) = 0.$$

Substitution of the value of $dP_{n+1}(x)/dx$ from (III) gives
us:

(IV) $DP_n''(x) + \left[N - (n-1)D'\right]P_n'(x) - n\left[N' - \dfrac{n-1}{2}D''\right]P_n(x) = 0.$

We readily see that the relation found for the Hermite polynomials

$$H_n''(x) - 2xH_n'(x) + 2nH_n(x) = 0$$

is a special case of (IV).
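Equation (IV) can be verified mechanically for any particular $N$ and $D$. The following sketch is an illustration under assumptions, not part of the paper: it rebuilds the polynomials from relation (I) for the arbitrarily chosen $N = 1 + 2x$, $D = 3 + x + x^2$ and checks that the left side of (IV), in the form $DP_n'' + [N-(n-1)D']P_n' - n[N' - \tfrac{n-1}{2}D'']P_n$, vanishes identically. All function names are hypothetical.

```python
from fractions import Fraction


def trim(p):
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])


def scale(c, p):
    return trim([c * a for a in p])


def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def deriv(p):
    return trim([k * a for k, a in enumerate(p)][1:] or [0])


def pearson_polynomials(N, D, count):
    """[P_1, ..., P_count] from relation (I)."""
    polys = [trim(N)]
    for n in range(1, count):
        factor = add(N, scale(-n, deriv(D)))
        polys.append(add(mul(factor, polys[-1]), mul(D, deriv(polys[-1]))))
    return polys


def residual_IV(N, D, n, P):
    """Left side of (IV) for the n-th polynomial P; should be the zero polynomial."""
    first = add(N, scale(-(n - 1), deriv(D)))
    zeroth = add(deriv(N), scale(Fraction(-(n - 1), 2), deriv(deriv(D))))
    out = add(mul(D, deriv(deriv(P))), mul(first, deriv(P)))
    return add(out, scale(-n, mul(zeroth, P)))


N, D = [1, 2], [3, 1, 1]
for n, P in enumerate(pearson_polynomials(N, D, 5), start=1):
    assert residual_IV(N, D, n, P) == [0]
```

Exact rational arithmetic is used so that the vanishing of the residual is not obscured by rounding.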

Using the lemma previously proved and replacing $N$ by
$N + kD'$ we can write (IV) for the polynomials $P_n(k,x)$
and $P_n(n,x)$:

(IV_k) $DP_n''(k,x) + \left[N + (k-n+1)D'\right]P_n'(k,x) - n\left[N' + \left(k - \dfrac{n-1}{2}\right)D''\right]P_n(k,x) = 0,$

(IV_n)¹ $DP_n''(n,x) + (N + D')\,P_n'(n,x) - n\left[N' + \dfrac{n+1}{2}D''\right]P_n(n,x) = 0.$


¹Since $D$ is any expression of the second degree and $N$ is any expression
of the first degree, it is obvious that $P_n(n,x)$ satisfies a linear equation
of the second order of the form:

$$(A_0 + A_1x + A_2x^2)\,y'' + (B_0 + B_1x)\,y' + Cy = 0$$

where $C = -n\left[(n-1)A_2 + B_1\right]$. It may be shown that if a differential
equation of the form considered has as one solution a polynomial of
degree $n$ then $C$ must be of the form specified. For suppose $P_n(x)$,
with leading coefficient $a_n$, satisfies the above differential equation
for $y$. Taking the $n$th derivative of this equation we get

$$n(n-1)A_2\,(n!\,a_n) + nB_1\,(n!\,a_n) + C\,(n!\,a_n) = 0$$

and solving for $C$ that:

$$C = -n\left[(n-1)A_2 + B_1\right].$$

It follows from our work that if a differential equation has the form

$$(A_0 + A_1x + A_2x^2)\,y'' + (B_0 + B_1x)\,y' - n\left[(n-1)A_2 + B_1\right]y = 0$$

then one solution of this differential equation is a polynomial of degree at
most $n$ obtained by finding the solution $y$ of the Pearson differential
equation

$$\frac{1}{y}\frac{dy}{dx} = \frac{B_0 + B_1x - (A_1 + 2A_2x)}{A_0 + A_1x + A_2x^2}$$

and determining the polynomial

$$\frac{1}{y}\,\frac{d^n(D^ny)}{dx^n}.$$

We recognize the second order differential equations mentioned
earlier in this chapter for the polynomials of the Pearson Type
IV, V and VI as well as the Jacobi and Laguerre polynomials as
special cases of formula (IV_n). Some further illustrations of
(IV_n) are the Tschebycheff¹ and Legendre² polynomials. The
Tschebycheff polynomials are developed from the differential
equation



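$$\frac{dy}{dx} = \frac{x}{1-x^2}\,y$$

and in this case formula (IV_n) becomes:

$$(1-x^2)\,P_n''(n,x) - x\,P_n'(n,x) + n^2\,P_n(n,x) = 0.$$

The Legendre polynomials

$$P_n(x) = \frac{1}{2^n\,n!}\,\frac{d^n(x^2-1)^n}{dx^n}$$

have as a corresponding differential equation

$$\frac{dy}{dx} = \frac{0\cdot y}{x^2-1}$$

and in turn formula (IV_n) is written:

$$(x^2-1)\,P_n''(n,x) + 2x\,P_n'(n,x) - n(n+1)\,P_n(n,x) = 0.$$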
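The Legendre case is simple enough to check by hand or by machine. The sketch below is illustrative, with hypothetical function names: it takes $y = 1$ and $D = x^2 - 1$, so that $P_n(n,x) = d^n(x^2-1)^n/dx^n$ (the Legendre polynomial up to the factor $1/(2^n n!)$), and confirms that $(x^2-1)P'' + 2xP' - n(n+1)P$ is the zero polynomial.

```python
def trim(p):
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])


def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def deriv(p):
    return trim([k * a for k, a in enumerate(p)][1:] or [0])


def legendre_like(n):
    """n-th derivative of (x^2 - 1)^n, coefficients lowest power first."""
    power = [1]
    for _ in range(n):
        power = mul(power, [-1, 0, 1])
    for _ in range(n):
        power = deriv(power)
    return power


def legendre_residual(n):
    """(x^2 - 1) P'' + 2x P' - n(n+1) P for P = legendre_like(n)."""
    P = legendre_like(n)
    terms = add(mul([-1, 0, 1], deriv(deriv(P))), mul([0, 2], deriv(P)))
    return add(terms, [-n * (n + 1) * c for c in P])


assert legendre_like(2) == [-4, 0, 12]          # 12x^2 - 4
assert all(legendre_residual(n) == [0] for n in range(1, 7))
```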

6. Just as in formula (II) we established a recurrence relation
for the polynomials $P_n(x)$, let us now obtain one for
the polynomials $P_n(n,x)$.

Consider once more the first derivative of $D^{k+1}y$, i. e.

$$\frac{d(D^{k+1}y)}{dx} = \left[N + (k+1)D'\right]D^ky.$$




¹R. Courant and D. Hilbert, op. cit., pp. 73-74.
²Ibid., pp. 66-69.





Taking the $n$th derivative of both sides of the equation we get:

$$\frac{d^{n+1}(D^{k+1}y)}{dx^{n+1}} = \left[N + (k+1)D'\right]\frac{d^n(D^ky)}{dx^n} + n\left[N' + (k+1)D''\right]\frac{d^{n-1}(D^ky)}{dx^{n-1}}.$$

Multiplying both sides of the equation by $D^{n-k}/y$ and replacing
the derivatives by the corresponding polynomials, we have

(V_k) $P_{n+1}(k+1,x) = \left[N + (k+1)D'\right]P_n(k,x) + n\left[N' + (k+1)D''\right]D\,P_{n-1}(k,x).$

In case we set $k = n$, we may write

(V_n) $P_{n+1}(n+1,x) = \left[N + (n+1)D'\right]P_n(n,x) + n\left[N' + (n+1)D''\right]D\,P_{n-1}(n,x),$

a recurrence relation similar to (II) and involving the polynomials
$P_{n+1}(n+1,x)$, $P_n(n,x)$ and $P_{n-1}(n,x)$.

7. Formula (V_n) may be written in still another form corresponding
to formula (I), i. e. a relation consisting of the same
terms as (V_n) except that the $(n-1)$th polynomial $P_{n-1}(n,x)$
is replaced by the first derivative of the $n$th polynomial.

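In order to obtain this relation we return to formula (III),
and substitute for $N$ the value $N + kD'$, which for $k = n$ gives

(III_n) $\dfrac{dP_n(n,x)}{dx} = n\left[N' + \dfrac{n+1}{2}D''\right]P_{n-1}(n,x)$

or

$$D\,P_{n-1}(n,x) = \frac{D}{n\left[N' + \frac{n+1}{2}D''\right]}\,\frac{dP_n(n,x)}{dx}.$$

Substituting the value for $D\,P_{n-1}(n,x)$ in (V_n) we thus obtain:

(VI) $P_{n+1}(n+1,x) = \left[N + (n+1)D'\right]P_n(n,x) + \dfrac{N' + (n+1)D''}{N' + \frac{n+1}{2}D''}\,D\,\dfrac{dP_n(n,x)}{dx}.$

From symmetry we might expect the fractional coefficient of the
derivative $P_n'(n,x)$ to be unity, but unfortunately this is not
the case.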

8. In looking over the relations existing for the Laguerre
polynomials we find one consisting of the first derivatives of the
$n$th and $(n-1)$th polynomials, and the $(n-1)$th polynomial,¹
i. e.

$$P_n'(n,x) - n\,P_{n-1}'(n-1,x) = -n\,P_{n-1}(n-1,x).$$

This relation is a special case of another form of formula (VI)
which we obtain in the following manner:

Differentiation of (VI) gives us:

$$\frac{dP_{n+1}(n+1,x)}{dx} = \left[N' + (n+1)D''\right]P_n(n,x) + \left[N + (n+1)D'\right]\frac{dP_n(n,x)}{dx} + \frac{N' + (n+1)D''}{N' + \frac{n+1}{2}D''}\left[D'\,\frac{dP_n(n,x)}{dx} + D\,\frac{d^2P_n(n,x)}{dx^2}\right].$$

Substituting the value for $D\,d^2P_n(n,x)/dx^2$ found in (IV_n)
changes this last expression to the form:

¹R. Courant and D. Hilbert, op. cit., pp. 77-79.





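$$\frac{dP_{n+1}(n+1,x)}{dx} = \left[N' + (n+1)D''\right]P_n(n,x) + \left[N + (n+1)D'\right]\frac{dP_n(n,x)}{dx} + \frac{N' + (n+1)D''}{N' + \frac{n+1}{2}D''}\left\{-N\,\frac{dP_n(n,x)}{dx} + n\left[N' + \frac{n+1}{2}D''\right]P_n(n,x)\right\}$$

which reduces to

(VII) $\dfrac{dP_{n+1}(n+1,x)}{dx} = (n+1)\left[N' + (n+1)D''\right]P_n(n,x) + \left\{N + (n+1)D' - \dfrac{N\left[N' + (n+1)D''\right]}{N' + \frac{n+1}{2}D''}\right\}\dfrac{dP_n(n,x)}{dx}.$

The special equation mentioned for the Laguerre polynomials
will be recognized as a special case of formula (VII) if we recall
that for the Laguerre polynomials the differential equation is of
the form

$$\frac{dy}{dx} = \frac{p-x}{x}\,y.$$

Substitution of $x$ for $D$ and $(p-x)$ for $N$ reduces (VII)
to

$$P_{n+1}'(n+1,x) = -(n+1)\,P_n(n,x) + (n+1)\,P_n'(n,x).$$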

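9. In this chapter we have defined two general types of
polynomials

$$P_n(x) = \frac{D^n}{y}\,\frac{d^ny}{dx^n} \qquad \text{and} \qquad P_n(k,x) = \frac{D^{n-k}}{y}\,\frac{d^n(D^ky)}{dx^n}.$$

The relationships for these polynomials $P_n(x)$ and $P_n(k,x)$
were derived without using the form of the solution of the differential
equation. Two fundamental formulas were derived,
for $P_n(x)$:

(I) $P_{n+1}(x) = (N - nD')\,P_n(x) + D\,\dfrac{dP_n(x)}{dx}$

and for $P_n(n,x)$ the corresponding formula:

(VI) $P_{n+1}(n+1,x) = \left[N + (n+1)D'\right]P_n(n,x) + \dfrac{N' + (n+1)D''}{N' + \frac{n+1}{2}D''}\,D\,\dfrac{dP_n(n,x)}{dx}.$

Two successive polynomials were shown to be related by the relations,
for $P_n(x)$:

(III) $\dfrac{dP_{n+1}(x)}{dx} = (n+1)\left[N' - \dfrac{n}{2}D''\right]P_n(x)$

and for $P_n(n,x)$:

(III_n) $\dfrac{dP_n(n,x)}{dx} = n\left[N' + \dfrac{n+1}{2}D''\right]P_{n-1}(n,x).$

In addition we found that it was possible to set up recurrence
relations involving the $(n+1)$th, $n$th and $(n-1)$th polynomials
and found these to be, for $P_n(x)$:

(II) $P_{n+1}(x) + (nD' - N)\,P_n(x) + n\left[\dfrac{n-1}{2}D'' - N'\right]D\,P_{n-1}(x) = 0$

and for $P_n(n,x)$:

(V_n) $P_{n+1}(n+1,x) = \left[N + (n+1)D'\right]P_n(n,x) + n\left[N' + (n+1)D''\right]D\,P_{n-1}(n,x).$

We further succeeded in developing a second order differential
equation for the $n$th polynomial $P_n(x)$:

(IV) $DP_n''(x) + \left[N - (n-1)D'\right]P_n'(x) - n\left[N' - \dfrac{n-1}{2}D''\right]P_n(x) = 0$

and for $P_n(n,x)$:

(IV_n) $DP_n''(n,x) + (N + D')\,P_n'(n,x) - n\left[N' + \dfrac{n+1}{2}D''\right]P_n(n,x) = 0.$

We also showed that we could derive a relation between the
derivatives of the polynomials $P_{n+1}(n+1,x)$, $P_n(n,x)$
and the polynomial $P_n(n,x)$:

(VII) $\dfrac{dP_{n+1}(n+1,x)}{dx} = (n+1)\left[N' + (n+1)D''\right]P_n(n,x) + \left\{N + (n+1)D' - \dfrac{N\left[N' + (n+1)D''\right]}{N' + \frac{n+1}{2}D''}\right\}\dfrac{dP_n(n,x)}{dx}.$

Finally, we noted that all of these formulas and relations apply
to the Hermite, Jacobi, Tschebycheff and Legendre polynomials
as well as the polynomials derived for the Pearson Type IV, V
and VI curves by Romanovsky.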





CHAPTER III 


1. So far the discussion in this paper has been limited to
the treatment of the Gram-Charlier series where the constants
$A_0, A_1, \ldots, A_n, \ldots$ depend upon polynomials in
$x$ which are independent of the function $F(x)$, and the generating
function $f(x)$ is a solution of the Pearson differential
equation, the functions $F(x)$ and $f(x)$ being defined as continuous
functions. The work in mathematical statistics involves
not only the use of the continuous variate and the continuous
function but also the case of the discrete variate and the discontinuous
function where this function is defined for equally spaced
values.

In dealing with the continuous variate we make use of the
theory of the differential and integral calculus, or the calculus of
limits, as it is sometimes called. On the other hand, for the discrete
variate we turn to the theory of the calculus of finite differences.
Further, it usually happens that there exists a parallelism
between results based on the derivative and integral and those
based on the finite differences and summations. As a consequence,
it seems natural to attempt to derive results for the finite difference
case paralleling those contained in the first half of this paper.
The second part of this paper is devoted to this purpose. The
first of the two following chapters considers matters pertaining
to Charlier's Type B series which is the finite difference parallel
to the Type A series, while the next chapter is devoted to the
polynomials connected with the finite difference parallel of the
Pearson differential equation.

Charlier in the second half of his article¹ "Ueber die Darstellung
willkürlicher Funktionen" considers a real valued function
$F(x)$ and asserts that it may be formally expanded in
terms of another function and its successive differences. Stated
as a theorem, this may be written as follows:

¹C. V. L. Charlier, op. cit., pp. 23-35.

Charlier's theorem for series B: Any real valued function
$F(x)$ which vanishes for $x = +\infty$ and $-\infty$, may be formally
expanded in terms of another function $g(x)$ and its successive
differences in the form

(B) $F(x) = B_0\,g(x) + B_1\,\Delta g(x) + \cdots + B_n\,\Delta^n g(x) + \cdots$

where $g(x)$ possesses the properties:

(a) $g(x)$ and its differences are defined for all real values
of $x$,

(b) $g(x)$ and its differences vanish for $x = +\infty$ and $-\infty$,

(c) g(x} for all real values of wand n . 

(d) 0. 

Paralleling the theory of the first half of his paper, Charlier
determines the constants $B_0, B_1, B_2, \ldots, B_n, \ldots$ and
finds that they may be expressed by the equation

A, ■ £ |t:l 

where $Q_n(x)$ is a polynomial in $x$ of degree not greater than
$n$. Analyzing the answers that he obtains for $Q_n(x)$ we find
that these polynomials form a uniquely determined set of polynomials
$Q_0(x), Q_1(x), \ldots, Q_n(x), \ldots$,
$Q_n(x)$ at most of degree $n$, biorthogonal in the sum sense to
the successive differences of the function $g(x)$, i. e. they satisfy
the biorthogonality conditions for the inverse of differences:





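$$\Delta^{-1}\left[Q_n(x)\,\Delta^m g(x)\right]\Big|_{-\infty}^{+\infty} = 0 \ \text{for } n \neq m, \qquad = 1 \ \text{for } n = m.$$

Charlier does not observe that the polynomials $Q_n(x)$ bear a
definite relation to one another, i. e.

$$\Delta Q_n(x) = -Q_{n-1}(x+1),$$

a relation similar to the one found for the polynomials $P_n(x)$
in Chapter I. We may state these facts in the following theorem:

Theorem: If $g(x)$ satisfies the conditions (a), (b), (c),
and (d) of Charlier's Theorem for series B and if $Q_0(x)$,
$Q_1(x), \ldots, Q_n(x), \ldots$ is the system of polynomials
in $x$, $Q_n(x)$ of degree at most $n$, which is biorthogonal to
$g(x)$ and its differences, i. e. satisfies the conditions

$$\Delta^{-1}\left[Q_n(x)\,\Delta^m g(x)\right]\Big|_{-\infty}^{+\infty} = 0 \ \text{for } n \neq m, \qquad = 1 \ \text{for } n = m$$

then

$$\Delta Q_n(x) = -Q_{n-1}(x+1).$$

The proof requires the use of the finite integration by parts
formula:

$$\Delta^{-1}\left[u(x)\,\Delta v(x)\right] = u(x)\,v(x) - \Delta^{-1}\left[v(x+1)\,\Delta u(x)\right].$$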
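The finite integration by parts formula is the summation analogue of ordinary integration by parts. A minimal numerical sketch follows, assuming the forward difference $\Delta f(x) = f(x+1) - f(x)$ used in this chapter and a finite range of summation $a \le x < b$; the function names and the test functions are illustrative choices only.

```python
def delta(f):
    """Forward difference operator: (delta f)(x) = f(x + 1) - f(x)."""
    return lambda x: f(x + 1) - f(x)


def summation_by_parts_sides(u, v, a, b):
    """Both sides of  sum_{x=a}^{b-1} u(x) dv(x)
       = u(b)v(b) - u(a)v(a) - sum_{x=a}^{b-1} v(x+1) du(x)."""
    lhs = sum(u(x) * delta(v)(x) for x in range(a, b))
    rhs = (u(b) * v(b) - u(a) * v(a)
           - sum(v(x + 1) * delta(u)(x) for x in range(a, b)))
    return lhs, rhs


lhs, rhs = summation_by_parts_sides(lambda x: x * x, lambda x: x ** 3 + 1, -3, 5)
assert lhs == rhs
```

When the product $u(x)v(x)$ vanishes at both ends of the range, the boundary term drops out, which is how the formula is used in the proof that follows.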






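Applying this formula we get

$$\Delta^{-1}\left[Q_n(x)\,\Delta^m g(x)\right]\Big|_{-\infty}^{+\infty} = Q_n(x)\,\Delta^{m-1}g(x)\Big|_{-\infty}^{+\infty} - \Delta^{-1}\left[\Delta Q_n(x)\,\Delta^{m-1}g(x+1)\right]\Big|_{-\infty}^{+\infty}.$$

The first term on the right hand side vanishes due to condition
(c) of the theorem of Charlier. Comparing the term which
remains, i. e.

$$-\Delta^{-1}\left[\Delta Q_n(x)\,\Delta^{m-1}g(x+1)\right]\Big|_{-\infty}^{+\infty} = 0 \ \text{for } n \neq m, \qquad = 1 \ \text{for } n = m$$

with the biorthogonality condition

$$\Delta^{-1}\left[Q_{n-1}(x)\,\Delta^{m-1}g(x)\right]\Big|_{-\infty}^{+\infty} = 0 \ \text{for } n \neq m, \qquad = 1 \ \text{for } n = m$$

we conclude that

$$\Delta Q_n(x) = -Q_{n-1}(x+1).$$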

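This theorem enables us to find the terms of the $n$th polynomial
by taking the negative of the integral of the $(n-1)$th
polynomial, except for the constant of integration. Following the
suggestion in our first chapter, we may also determine this constant.
We have

$$Q_n(x) = -\Delta^{-1}Q_{n-1}(x+1) + c$$

and the simple biorthogonality condition

$$\Delta^{-1}\left[Q_n(x)\,g(x)\right]\Big|_{-\infty}^{+\infty} = 0.$$

It follows that

$$-\Delta^{-1}\left\{\left[\Delta^{-1}Q_{n-1}(x+1)\right]g(x)\right\}\Big|_{-\infty}^{+\infty} + c\,\Delta^{-1}g(x)\Big|_{-\infty}^{+\infty} = 0$$

and solving for $c$ we get

$$c = \frac{\Delta^{-1}\left\{\left[\Delta^{-1}Q_{n-1}(x+1)\right]g(x)\right\}\Big|_{-\infty}^{+\infty}}{\Delta^{-1}g(x)\Big|_{-\infty}^{+\infty}}.$$

We may therefore determine the polynomials $Q_n(x)$ from the
polynomials next preceding by the formula

$$Q_n(x) = -\Delta^{-1}Q_{n-1}(x+1) + \frac{\Delta^{-1}\left\{\left[\Delta^{-1}Q_{n-1}(x+1)\right]g(x)\right\}\Big|_{-\infty}^{+\infty}}{\Delta^{-1}g(x)\Big|_{-\infty}^{+\infty}}.$$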



If we adopt the Charlier notation

and the common notation $x^{(n)} = x(x-1)(x-2)\cdots(x-n+1)$
and observe that $Q_0(x) = 1/\varepsilon_0$ and that

$$\Delta^{-1}x^{(m)} = \frac{x^{(m+1)}}{m+1}$$

we may obtain the polynomials $Q_1(x)$, $Q_2(x)$, ...
without much computation as follows:

Q^(x)~ -A'^QJx-hO * y- 

Cx+if*’^ £.fx+l) ^ S£?+€,£i, 


lie. 



or ~e^x*r3c*x^(C-^^) 

- ^ *r-2eo*- *6e^yfE^£^*3e^l*z e, 

-6e^£,£^-6C*£^r6€,^ 






These results differ slightly from those obtained by Charlier
in his article. This is due to the definition for differences used
by Charlier, viz.:

$$\Delta g(x) = g(x) - g(x-1)$$

while we have used the definition

$$\Delta g(x) = g(x+1) - g(x).$$

Denoting the difference $g(x) - g(x-1)$ by $\delta g(x)$,
Charlier determines a set of polynomials $T_n(x)$ satisfying the
conditions,

$$\delta^{-1}\left[T_m(x)\,\delta^n g(x)\right]\Big|_{-\infty}^{+\infty} = 0 \ \text{for } m \neq n, \qquad = 1 \ \text{for } m = n.$$

As a consequence by paralleling the reasoning above one proves
easily that the $T_n(x)$ satisfy the recurrence relation

$$T_n(x+1) - T_n(x) = -T_{n-1}(x).$$

By using this relation and the fact that 

it can be shown without much difficulty that 

r^Cx + n-l) » 0„{X) 

The theorem proved in Ch. 1, par. 2, could no doubt be 
paralleled by using finite difference theory. Since the method of 
procedure is obvious there seems to be no need of taking it up 
in detail. 





We have succeeded in showing in this chapter that the problem
of determining the constants for the Charlier Type B series
closely parallels the work of the first chapter and that these
constants are readily obtained by using the biorthogonality conditions
for finite differences.





CHAPTER IV 


Polynomials Connected with the Pearson Difference Equation

1. In Chapter II we referred to certain solutions $f(x)$
of the Pearson differential equation and noted that graphically,
these functions represented types of curves used in statistical
work. Paralleling this work, we would expect to find that a difference
equation similar in composition to the Pearson differential
equation would have as solutions functions $g(x)$ which could
be used to represent data consisting of discrete variates.

Carver, in an article in the "Handbook of Mathematical
Statistics,"¹ suggests the use of a difference equation corresponding
to the Pearson differential equation, i. e.:




a difference equation with a numerator of the first and denominator
of any desired degree in $x$. If we confine our work to a
denominator of degree at most of the second in $x$, we should be
able to obtain results comparing very favorably with those obtained
in the second chapter.

An illustration of a solution of this difference equation found
in Charlier's article "Ueber die Darstellung willkürlicher Funktionen,"²
is the well known Poisson exponential function

$$\frac{e^{-\lambda}\lambda^x}{x!}.$$


¹H. C. Carver, "Frequency Curves," Handbook of Mathematical Statistics
(H. L. Rietz, Editor), Chapter VII, pp. 111-114.

²C. V. L. Charlier, op. cit., p. 33.





This function satisfies the difference equation

$$\frac{\Delta u_x}{u_x} = \frac{\lambda - (x+1)}{x+1}$$

and this equation is recognized as a special form of the Pearson
difference equation. If we take the successive differences of this
Poisson exponential function, we find that these give rise to a
unique set of polynomials. These polynomials may be written in
the following form:

$$Q_1(x) = \lambda - (x+1),$$

$$Q_2(x) = \lambda^2 - 2\lambda(x+2) + (x+2)(x+1),$$

or making use of the usual difference notation $x^{(m)}$ for
$x(x-1)(x-2)\cdots(x-m+1)$, we write

$$Q_2(x) = \lambda^2 - 2\lambda(x+2) + (x+2)^{(2)},$$

$$Q_3(x) = \lambda^3 - 3\lambda^2(x+3) + 3\lambda(x+3)^{(2)} - (x+3)^{(3)},$$

$$\text{or} \quad Q_3(x-3) = \lambda^3 - 3\lambda^2x + 3\lambda x^{(2)} - x^{(3)}.$$

These polynomials have the same form as that for the binomial
expansion $(\lambda - x)^n$, particularly if we use the difference
notation for representing powers of $x$. In other words, we
might look upon the $n$th polynomial as being defined as

$$Q_n(x-n) = [\lambda - x]^{(n)}.$$

A careful examination of these polynomials brings out the
fact that consecutive ones are related to each other, viz., that we
have,

$$\Delta Q_n(x) = -n\,Q_{n-1}(x+1).$$

This relation is similar to the one found for Hermite polynomials.
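The Poisson polynomials and the relation between consecutive ones can be checked with exact arithmetic. This sketch assumes the definition $Q_n(x) = (x+1)(x+2)\cdots(x+n)\,\Delta^n\psi(x)/\psi(x)$ with $\psi(x) = e^{-\lambda}\lambda^x/x!$ and the forward difference; the name `poisson_Q` and the value $\lambda = 3$ are illustrative choices only.

```python
from fractions import Fraction
from math import comb


def poisson_Q(n, x, lam):
    """Q_n(x) = (x+1)...(x+n) * Delta^n psi(x) / psi(x), psi the Poisson function.

    Uses Delta^n psi(x) = sum_k (-1)^(n-k) C(n,k) psi(x+k) together with
    psi(x+k)/psi(x) = lam^k / ((x+1)...(x+k))."""
    total = Fraction(0)
    for k in range(n + 1):
        remaining = 1
        for j in range(k + 1, n + 1):   # factors (x+k+1)...(x+n) left after cancelling
            remaining *= x + j
        total += (-1) ** (n - k) * comb(n, k) * Fraction(lam) ** k * remaining
    return total


lam = 3
# Closed forms quoted in the text, evaluated at a few integer points.
for x in range(0, 6):
    assert poisson_Q(1, x, lam) == lam - (x + 1)
    assert poisson_Q(2, x, lam) == lam ** 2 - 2 * lam * (x + 2) + (x + 2) * (x + 1)

# Difference relation between consecutive polynomials.
for n in range(1, 6):
    for x in range(0, 6):
        assert poisson_Q(n, x + 1, lam) - poisson_Q(n, x, lam) == -n * poisson_Q(n - 1, x + 1, lam)
```

Since both sides are polynomials in $x$ of bounded degree, agreement at enough integer points is a meaningful check, although it is not a proof.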

The fact that the Charlier Type A series in Chapter II
consisted of successive derivatives and that the derivatives of the
solutions of the Pearson differential equation led to a system of
polynomials definitely related to one another, gave rise to the theory
developed in that chapter. We found that it was not necessary
in this theory to consider the form of the solution of the
equation, but that a set of general polynomials could be set up
which satisfied all the properties of the special polynomials. The
Charlier Type B series consists of successive differences of a function
$g(x)$ and it is quite natural for us to suspect that we can
develop for the solutions of the Pearson difference equation a corresponding
theory on polynomials.

This question of obtaining a system of polynomials from the
solutions of the Pearson difference equation

(1) $\dfrac{\Delta u_x}{u_x} = \dfrac{a_0 + a_1x}{b_0 + b_1x + b_2x^2},$

numerator of the first degree and denominator of the second
degree, will concern us in this chapter. We shall further show
that these polynomials are related to one another by means of
first and second order difference relations and by means of recurrence
relations involving the $(n+1)$th, $n$th and $(n-1)$th
polynomials, and shall illustrate these equations with the Poisson
exponential function.





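2. For convenience denote the numerator $(a_0 + a_1 x)$ in
equation (1) by $N_x$ and the denominator $(b_0 + b_1 x + b_2 x^2)$
by $D_x$. We may then define a set of polynomials by the following
theorem:

Theorem: If $u_x$ is a non-identically zero solution of

$\dfrac{\Delta u_x}{u_x} = \dfrac{N_x}{D_x}$,

then $\dfrac{1}{u_x}\,D_x D_{x+1}\cdots D_{x+n-1}\,\Delta^n u_x$ is a polynomial of degree
at most $n$, i. e. $Q_n(x)$.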

The proof will proceed by mathematical induction. If we 
recall the formula for the difference of a product 




we obtain by differencing 


D^A ‘ Qi (x)u^ 


the equation 


^x+1 ^ O^'I- 


Using the value for $\Delta u_x$ from the original difference equation
and multiplying the equation through by $D_x$, we obtain:


•^x ^xH • {'^x QA)^- N^AQ, (xJ^N^AD^] 





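Since the coefficient of $u_x$ is a polynomial of degree at most 2
in $x$, we write $D_x D_{x+1}\,\Delta^2 u_x = Q_2(x)\,u_x$.

Let us now assume that the statement holds for $m = n$, i. e.
$D_x D_{x+1}\cdots D_{x+n-1}\,\Delta^n u_x = Q_n(x)\,u_x$.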

Differencing both sides of this equation gives us 

OAA,, ■ An-A ’X 


Now 


^ ^ ^c-f-!' rj-t “ ^x+1 ' ^c-i- 77 ^-/-i 


Hence by the definition of M 

a\pa, ■ 

Substituting these values in the above equation as well as the 
value for A from (1) and multiplying by , the equation 
reduces to 

An M 

The coefficient of on ffie right hand side is a polynomial of 
d^ee at most tj in x . We therefore conclude that 





^ ‘^x“ ^rn-i • 

We have also succeeded in deriving a relation similar to relation
(I) of Chapter II, i. e.

(XI)   $Q_{n+1}(x) = (N_x + D_x - D_{x+n})\,Q_n(x) + (N_x + D_x)\,\Delta Q_n(x)$,

a relation which shows that the $(n+1)$th polynomial is made up
of the $n$th polynomial and the difference of the $n$th polynomial.
This relation differs from relation (I) in the fact that the
coefficient of $\Delta Q_n(x)$ is $N_x + D_x$ instead of $D$. This
change seems to be connected with the fact that the original
difference equation

$\dfrac{\Delta u_x}{u_x} = \dfrac{N_x}{D_x}$

can also be written

$\dfrac{u_{x+1}}{u_x} = \dfrac{N_x + D_x}{D_x}$.

Formula (XI) may also be written

(XIa)   $Q_{n+1}(x) = (N_x + D_x)\,Q_n(x+1) - D_{x+n}\,Q_n(x)$,

since $\Delta Q_n(x) = Q_n(x+1) - Q_n(x)$.

It seems advisable to adopt a notation for the term

$D_x D_{x+1} D_{x+2} \cdots D_{x+n-1}$

since it will continue to be involved in the work that is to follow.
The difference notation $x^{(m)} = x(x-1)(x-2)\cdots(x-m+1)$
suggests that we use the symbol $D^{(n)}_{x+n-1}$, i. e.
$D^{(n)}_{x+n-1} = D_{x+n-1} D_{x+n-2} \cdots D_{x+1} D_x$.















Then we will have 




'D. 


'z’htj-X 


r77) 


and 




■D D 

^X+Z ^XH 


’ -^> 77 -^ ^c•h rj-z ^ n '^z 


{r7~l> 


3. We may also define the general polynomials $Q_n(m,x)$,
where $m$ is any integer, by means of a theorem as follows:

Theorem: If $u_x$ is a non-identically zero solution of the
difference equation (1), then

$$\frac{D^{(n)}_{x-m+n-1}}{D^{(m)}_{x-1}\,u_x}\;\Delta^n\!\left[D^{(m)}_{x-1}\,u_x\right]$$

is a polynomial $Q_n(m,x)$, and $Q_n(m,x)$ is at most
of degree $n$ in $x$. In particular if $m = n$, we have

$$\frac{1}{u_x}\;\Delta^n\!\left[D^{(n)}_{x-1}\,u_x\right]$$

is a polynomial in $x$ of degree at most $n$.

This theorem may be proved by using the following lemma:
Lemma: If $u_x$ satisfy the difference equation (1), then
$D^{(m)}_{x-1}u_x$, where $m$ is any positive integer, satisfies a
difference equation of the same type, viz.:

$$\frac{\Delta\!\left[D^{(m)}_{x-1}\,u_x\right]}{D^{(m)}_{x-1}\,u_x} = \frac{N_x + D_x - D_{x-m}}{D_{x-m}}.$$

The proof proceeds easily by mathematical induction.

For 777= 1 we have 






^ [Px-i ^ 

u.^ 


^Ti U \^X. '^^■K ~^x-i\ 


For 77T » 2, we get 


’^y^x-z^^x-i 




D^ 


or 


A\DZu,\.D“a, 


oj-je 




X-2 


Let us assume that it holds for the $m$th case, i. e.


^t4-r“J-47"“ 


^ r "-^-777 

'X 




Then 


:-7>7-i 


- ■'i.™ 4fl2’''‘d^£C'“« All, 


^y^x-i 


N^+D^-D, 


"^x-xn-i 


Making use of this lemma in proving the last theorem, we
note that

* 

x.X X‘ X-X X 





and in general that 


or 




X-rn ^Tt-l 
Cm) 




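In particular, if $m = n$, we define the polynomials $Q_n(n,x)$
as $\Delta^n\!\left[D^{(n)}_{x-1}u_x\right] = Q_n(n,x)\,u_x$, which relation is of
interest because the $\Delta^n\!\left[D^{(n)}_{x-1}u_x\right]$ has no $D_x$ as multiplier.
Any result derived for the polynomials
$Q_n(x) = D^{(n)}_{x+n-1}\,\Delta^n u_x / u_x$, where $u_x$
is a solution of the difference equation (1), can now be extended
to the polynomials $Q_n(m,x)$ by replacing $N_x$
by $(N_x + D_x - D_{x-m})$ and $D_x$ by $D_{x-m}$. For
example, relation (XI) becomes

(XI$_m$)   $Q_{n+1}(m+1,x) = (N_x + D_x - D_{x-m+n-1})\,Q_n(m+1,x) + (N_x + D_x)\,\Delta Q_n(m+1,x)$,

and when $m = n$, this relation reduces to

(XI$_n$)   $Q_{n+1}(n+1,x) = (N_x + \Delta D_{x-1})\,Q_n(n+1,x) + (N_x + D_x)\,\Delta Q_n(n+1,x)$.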

4. In analogy with the work of Chapter II, we next proceed
to find a recurrence relation involving the $(n+1)$th, $n$th and
$(n-1)$th of the polynomials $Q_n(x)$. We take the $n$th
difference of both sides of the equation

$D_x\,\Delta u_x = N_x\,u_x$





by making use of the formula for the $n$th difference of a product


^ *r?ay^A 

, TTfn-t) ^z. 


TT 




We then obtain the equation 

N^A * ttAA , 

and A^N^ being equal to zero. Multiplying through 
byi 3 ;f^ we get 

J^x7i, ^ r,PTJn 

• - ”- 0 . 7 ^ ^ »,-a 

But and * “a; *-^^Utx *- A^Ux . 

Substituting these values in the last equation and using the defiin* 
ition for the polynomials (x) , we obtain: 

[Dx^„ <?„ (x)^ <?„^jt^J]<^x 




A^D^ 






€Jt) 




X^Tf ^ 


X4ff 






Dividing through by 'and ccdlecting like terms, this expression 
reduces to 





iy- 'n(Ti-l) _£^ 




A 




XL 


r7l>^^„AN, 


~ '^■^X+77-t ^x+r» ^ 

Now we know that 


u. 






7>(-n-i) ,,z 




^/77 


= 4^^ * rjA V- —J7— /3 44^ ^- 


and so we may write and in this same term, i. e. 

•0„„ ‘D, * 7,AD, ^ ’’i7"4*D„ 


AD,,,. AD,.r:A'D,. 


and ^ 

the third and higher differences of and the second and higher 
differences of A4 being equal to zero. The coefficient of 
reduces to —and the coefficient of 
(x) also reduces to a simpler form. Dividing tlirough by 

2 . 7D 

... ^ we finally get the recurrence relation: 

^r/4.± 77- i ^ 77 ^ ^77 

(XII) 

i. e. the $(n+1)$th polynomial may be obtained from the $n$th and
$(n-1)$th polynomials.

In Chapter II we found that relations (I) and (II) were
identical for the first two terms, and as a consequence we equated
the third terms and obtained a relation between the derivative of
a polynomial $P_n(x)$ and the polynomial preceding it. In order
that we may obtain a similar expression for the difference
polynomials, we must change the appearance of formula (XII).

By lowering the degree in formula (XI) from $n$ to $n-1$
and solving for

M)-Q„ (x). 

Substitution of this relation in formula XII gives 

Qr,*i M 

» Qr,» M. ,JAi - 

Just as in Chapter IV, paragraph 3, the coefficient of $Q_n(x)$
reduces and becomes the same as the coefficient of $Q_n(x)$ in
formula (XI) and we have

C>^) 

(XII') 

We therefore conclude that

(XIII)   $\Delta Q_n(x) = n\left[\Delta N_x - \dfrac{n-1}{2}\,\Delta^2 D_x\right] Q_{n-1}(x+1)$,

a relation expressing the difference of a polynomial $Q_n(x)$ in
terms of the next preceding polynomial in $(x+1)$, i. e.





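$Q_{n-1}(x+1)$. For the polynomial $Q_n(n,x)$, formula (XIII)
may be written in the form

(XIII$_n$)   $\Delta Q_n(n,x) = n\left[\Delta N_x + \dfrac{n+1}{2}\,\Delta^2 D_x\right] Q_{n-1}(n,x+1)$,

this relation being obtained by replacing $N_x$ by $(N_x + D_x - D_{x-n})$
and $D_x$ by $D_{x-n}$.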

Formula XIII which was just derived is the general form of
the relation we found to hold for the Poisson exponential function
polynomials, i. e.

$\Delta Q_n(x) = -n\,Q_{n-1}(x+1)$.

We find further that these polynomials satisfy a special form of
(XI), i. e.

$Q_{n+1}(x) + (x+n+1-\lambda)\,Q_n(x) - \lambda\,\Delta Q_n(x) = 0$,

and for formula (XII) we get the special form

$Q_{n+1}(x) + (x+2n+1-\lambda)\,Q_n(x) + n(x+n)\,Q_{n-1}(x) = 0$.

This recurrence relation is also similar to the one given for
Laguerre polynomials.
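
The three Poisson relations above lend themselves to a direct numerical check. The following is a minimal sketch, assuming $u_x = e^{-\lambda}\lambda^x/x!$ with $N_x = \lambda - x - 1$ and $D_x = x + 1$, and assuming the definition $Q_n(x) = D_x D_{x+1}\cdots D_{x+n-1}\,\Delta^n u_x / u_x$; the function names are illustrative only and exact rational arithmetic is used so that the comparisons are not disturbed by rounding.

```python
from fractions import Fraction
from math import comb


def poisson_ratio(x, k, lam):
    """Return u_{x+k} / u_x for u_x = exp(-lam) * lam**x / x!, exactly."""
    ratio = Fraction(1)
    for j in range(1, k + 1):
        ratio *= Fraction(lam) / (x + j)
    return ratio


def q_poisson(n, x, lam):
    """Q_n(x) = (x+1)(x+2)...(x+n) * (Delta^n u_x) / u_x, taking D_x = x + 1."""
    # Delta^n u_x / u_x expanded as an alternating binomial sum of u_{x+k} / u_x.
    delta_n = sum((-1) ** (n - k) * comb(n, k) * poisson_ratio(x, k, lam)
                  for k in range(n + 1))
    product = Fraction(1)
    for j in range(1, n + 1):
        product *= x + j
    return product * delta_n


def relations_hold(lam, n_max=6, x_max=8):
    """Check the first-difference, (XI)-type and three-term relations on a grid."""
    for n in range(1, n_max):
        for x in range(x_max):
            q_prev = q_poisson(n - 1, x, lam)
            q_n = q_poisson(n, x, lam)
            q_next = q_poisson(n + 1, x, lam)
            d1 = q_poisson(n, x + 1, lam) - q_n  # Delta Q_n(x)
            ok = (
                d1 == -n * q_poisson(n - 1, x + 1, lam)
                and q_next + (x + n + 1 - lam) * q_n - lam * d1 == 0
                and q_next + (x + 2 * n + 1 - lam) * q_n + n * (x + n) * q_prev == 0
            )
            if not ok:
                return False
    return True


print(relations_hold(Fraction(5, 2)))
```

The value of $\lambda$ and the grid of $n$ and $x$ are arbitrary choices; any rational $\lambda$ may be substituted.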

5. Turning now to the problem of obtaining a second order
difference relation for the polynomials $Q_n(x)$, we proceed to
difference formula (XI), i. e.







and get 

*(A N^*AJ)JA(?„ M / *-Px*i)^ ’’<?r, M • 

Substituting for $\Delta Q_{n+1}(x)$ the value

$(n+1)\left[\Delta N_x - \dfrac{n}{2}\,\Delta^2 D_x\right]\left[Q_n(x) + \Delta Q_n(x)\right]$

found in formula (XIII), gives us
(■n-tl) \6N^ -1A *dJ [<?„ (X) -i-AQ^ - 

pl/V, *AD^ AQ„ (x) 

+ A^Q„(x). 

Collecting the coefficients of like terms and simplifying them, we 
finally get 

A9„(X) 

(XIV) 

-n\AN^ - Q„rx).0, 

a relation very similar in form to formula (IV) and consisting
of the first and second differences of the polynomial $Q_n(x)$.
This relation when applied to the Poisson exponential function
gives

$\lambda\,\Delta^2 Q_n(x) + (\lambda - x - 1)\,\Delta Q_n(x) + n\,Q_n(x) = 0$,

an equation which can be checked by substituting the value of
the general Poisson polynomial in it.

The extension of formula (XIV) to the polynomials $Q_n(m,x)$
and $Q_n(n,x)$ by making the proper substitutions for $N_x$
and $D_x$ results in the following expressions:

(XIV^) 

- .AD^-AJ)^.„ -(■^)A%.^Q„ £ 7 ^ 

which may also be written as: 

* ■>-(m-n*l)AI^ - a"d^ AQ^(m^x) 

- T7 [a/V^ - {yrix)^0 

In particular if $m = n$ we have:

(XIV^) \ri.r,,, ^^,-^A^j>;^A<?„ 

- n\AA/^ * ^A‘D^ (p^ (y,, x)x O. 

6. The next set of relations we shall derive are recurrence
relations for the polynomials $Q_n(m,x)$ and $Q_n(n,x)$.

In the lemma proved in this chapter we found that 

Taking the $n$th difference of both sides of the equation gives:

the second difference of the trinomial $N_x + D_x - D_{x-m}$ being

equal to zero. Multiplying this last expression through by




the value we get 

= (TA; *D^ ■ 

* 77« A A „Jjf’'"’(-^) 

Dividing through by we get a recurrence relation 

involving the polynomials ^ 714-1 Qf, ( 

and (rn, x+l) , i- e. 


(XVw) 

+ t^AN^ ■•■(m-il)/^D^ (f^^+Djg) ^Ti-i (^■ 

For $m = n$, this expression reduces to:

^x*^x--^x-n-iX^77 

(XV„) 

■•■•n^N^ + fn4l)A^D^(N^+D^Q^ i (v, x+l). 

7. Another form of this relation is obtained by substituting
the value found in (XIII$_n$) for $Q_{n-1}(n,x+1)$, i. e.

$Q_{n-1}(n,x+1) = \dfrac{1}{n\left[\Delta N_x + \frac{n+1}{2}\,\Delta^2 D_x\right]}\;\Delta Q_n(n,x)$,

in formula (XV$_n$), which gives

Qr,4S M 




AN^*( 774 l)A^Il^ 


^Qr, Of.*), 


(XVI) 





a relation very similar to formula (VI). 

8. There remains one more formula in Chapter II for which
we have not yet found a parallel in this chapter, i. e. formula VII.
To obtain this parallel expression, we difference formula (XVI),
thereby obtaining:

ANx. + (-n-tl)A^D^ P, -1 

In formula (XIV^) we found a value for 


which when substituted in this last expression gives us: 

, „ r..*, . (yr*i) ^ 


Collecting coefficients we get 

nln^ a^d}ao„M 





and by simplifying the coefficients this expression finally reduces 
to the formula 


(XVII) 







r ^ 


_ ^ J 

y 




a relation which is also similar in form to formula VII. 

Before concluding this chapter, we might examine the character
of the polynomials $Q_n(n,x)$ when the original function
is the Poisson exponential function $e^{-\lambda}\lambda^x/x!$.




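We find these polynomials to have the following form:

$$Q_n(n,x) = \frac{x!}{e^{-\lambda}\lambda^{x}}\;\Delta^n\!\left[x^{(n)}\,\frac{e^{-\lambda}\lambda^{x}}{x!}\right] = \frac{x!}{e^{-\lambda}\lambda^{x}}\;\Delta^n\!\left[\frac{e^{-\lambda}\lambda^{x}}{(x-n)!}\right] = \sum_{k=0}^{n}(-1)^k\binom{n}{k}\lambda^{n-k}\,x^{(k)}.$$
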


Substituting the proper values for $N_x$ and $D_x$ in formula
(XIV$_n$) we get

$\lambda\,\Delta^2 Q_n(n,x) + (\lambda - x + n - 1)\,\Delta Q_n(n,x) + n\,Q_n(n,x) = 0$.




In the same way we find for formula (XI$_n$) the relation


and for formula (XVII), the reduced relation 

/ nx), 


which is somewhat like the relation obtained for (XIII$_n$).

We might call attention to the fact that these polynomials are
identical with the polynomials obtained by Charlier¹ satisfying the
relations



x! J.«, 


= 0 for $m \neq n$
= 1 for $m = n$


9. Summarizing the results of this chapter, we have found
that if the general solution $g(x)$ of the difference equation

$\dfrac{\Delta g(x)}{g(x)} = \dfrac{N_x}{D_x}$

is used as the generating function $g(x)$ in the Charlier Type B
series, the successive differences give rise to two general
types of polynomials which we defined as follows:

and 

With the aid of the properties of the $\Delta$ operator, we derived
a set of relations and equations for these polynomials of the
following form:


¹C. V. L. Charlier: “Ueber die Darstellung willkürlicher Funktionen,” p. 34.





(XI)   $Q_{n+1}(x) = (N_x + D_x - D_{x+n})\,Q_n(x) + (N_x + D_x)\,\Delta Q_n(x)$.




(XII) 


Qr,^jM = ^I^X*r7 - M 


(XII') 

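(XIII)   $\Delta Q_n(x) = n\left[\Delta N_x - \dfrac{n-1}{2}\,\Delta^2 D_x\right] Q_{n-1}(x+1)$,


(XIII$_n$)   $\Delta Q_n(n,x) = n\left[\Delta N_x + \dfrac{n+1}{2}\,\Delta^2 D_x\right] Q_{n-1}(n,x+1)$,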


(XIV) 


(XIV„) 


(XVn) 


ov;./ -rrr-JMjq] a; 

- 7i[aN^-^^A%] (?„ rx)^o, 

-v\aN^* Q^ 

<Pn*i (h^) 

■h 77\AN^-f-{rfk)A^I>^ (N^*^x)Qr,-l^^>^*0, 


MJnz), 





(XVII) ^ \an^ Qr, M 

Each of these formulas corresponds and is similar to a formula
found in Chapter II. In fact, it seems probable that if we
developed the formulas in this present chapter from the equation



and permitted the $\Delta x$ to approach zero as a limit, the formulas
of Chapter II would result, the above formulas being the case
where $\Delta x = 1$.




A NEW FORMULA FOR PREDICTING THE 
SHRINKAGE OF THE COEFFICIENT 
OF MULTIPLE CORRELATION 


By

Dr. R. J. WHERRY

Cumberland University, Lebanon, Tennessee 


With the perfection of the Doolittle Method for the solution 
of the constant values necessary for the multiple correlation and 
prediction technique, we may expect a constant increase in the 
use of this method in statistical practice. Theoretical statisticians 
have recognized for some time however that the multiple correla¬ 
tion coefficient, derived from a large number of independent vari¬ 
ables, is apt to be deceptively large due to chance factors. When 
prediction equations derived in this manner are applied to sub¬ 
sequent sets of data, there is apt to be a rather large shrinkage 
in the resulting correlation coefficient obtained, as compared with 
the original observed multiple correlation coefficient. In order 
to avoid over optimism it is necessary to have some equation 
which will predict the most probable value of this shrinkage. The 
development of such a formula is the purpose of this paper. 

The most promising formula of this type so far developed
is the B. B. Smith formula, presented by M. J. B. Ezekiel at the
December, 1928, meeting of the American Mathematical Society
held at Chicago. This formula is

(1)   $\bar{R}^2 = 1 - \dfrac{N}{N-M}\,(1 - R^2)$



R, 7 . WHERRY 


441 


where $\bar{R}$ = the estimated correlation obtaining in the universe

$R$ = the observed multiple correlation coefficient
$M$ = the number of independent variables
$N$ = the number of observations (the statistical population).

This formula was evidently developed by B. B. Smith by an 
application of the method of least squares as follows (the deriva¬ 
tion is that of the author, since he could not find it given else¬ 
where) : 

The customary formula for the coefficient of multiple corre¬ 
lation may be written in the form 

( 2 ) 1 -^ 

0‘q 


where 

(3) 

where 


N 


The method of least squares, however, says that the most 
probable value of the standard error of estimate is not that given 
in equation (3) but 



Now, if we substitute the value of (5) in place of (3) in 
equation (2), we have at once 


¹See Merriman, Method of Least Squares, John Wiley & Sons, London,
8th Edition, pp. 80-82. Also see derivation later in this paper. 



442 


FORMULA FOR PREDICTING SHRINKAGE 


( 6 ) 


_-e 

7? 




N-M 


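and since, by (2) above, the ratio of the squared standard error of
estimate of (3) to the squared standard deviation of the dependent
variable is equal to $(1 - R^2)$, we have

(7)   $\bar{R}^2 = 1 - \dfrac{N}{N-M}\,(1 - R^2)$

which is, exactly, the B. B. Smith formula (1).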

This formula has been widely used during the last few years,
but up until recently had not been subjected to much critical
examination. However, in a recent article in the Journal of
Educational Psychology,¹ S. C. Larson actually tested the formula
empirically on some data obtained from the Mississippi Survey
conducted by M. V. O'Shea, obtaining the results indicated in the
tables and graphs below, and on the basis of which he reached
the following conclusion:

“The Smith Shrinkage-Reduction formula parallels all of
the empirical findings but quite consistently gives values which
are in excess of those obtained under present experimental
conditions.” This meant that the Smith formula predicted shrinkages
consistently greater than those actually obtained.

It was in view of this reported empirical difference that the 
writer started his attempt to derive the Smith formula and hit on 
the method given above. The question at once arose in the writ¬ 
er's mind as to why, when the standard error of estimate had been 
corrected to correspond to the most probable value by a least 
squares criterion, the standard deviation of the dependent variable 
had not been treated in the same fashion. 


¹“The Shrinkage of the Coefficient of Multiple Correlation,” Jan., 1931,
pp. 45-55. 





Merriman, whose formula we used above in correcting the 
standard error of estimate (5), likewise, and by identical reasoning, 
shows that the most probable value of the standard deviation of the 
dependent variable existing in the universe, should really be rep¬ 
resented by the following relationships: 

Where 


( 8 ) 



we find 

(9) 





N-1 





which reduces formula (6) to the form 

2 J±- 

(10a) 

N'l 

and when the same substitution is made as in step (7) above, we
have

(10b)   $\bar{R}^2 = \dfrac{(N-1)R^2 - (M-1)}{N-M}$


which is, by a more correctly applied criterion of least squares, 
the formula we have been seeking, and is a closer approximation 
than that given by the Smith formula. 
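
The two estimates can be compared numerically. The following is a minimal sketch, assuming the Smith formula in the form $1 - (1-R^2)N/(N-M)$ and formula (10b) as stated above; the function names and the sample values of $R$, $N$ and $M$ are hypothetical and chosen only for illustration.

```python
def smith_r2(r2_observed, n_obs, n_vars):
    """Smith estimate of the squared universe correlation: 1 - (1 - R^2) N / (N - M)."""
    return 1.0 - (1.0 - r2_observed) * n_obs / (n_obs - n_vars)


def wherry_r2(r2_observed, n_obs, n_vars):
    """Formula (10b): ((N - 1) R^2 - (M - 1)) / (N - M)."""
    return ((n_obs - 1) * r2_observed - (n_vars - 1)) / (n_obs - n_vars)


# Hypothetical illustration: observed R = 0.70 with N = 50 observations, M = 5 variables.
r2 = 0.70 ** 2
smith = smith_r2(r2, 50, 5)
wherry = wherry_r2(r2, 50, 5)
print(smith, wherry)
# The two estimates differ by (1 - R^2) / (N - M), so (10b) always shrinks less.
print(wherry - smith)
```

Either function may return a negative value when $R^2$ is small relative to $M/N$, in which case no real square root exists and the estimate is best read as zero; that reading is an interpretation, not something stated in the text.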

The reasons for the substitutions made above in our formulae 
may not be entirely clear to all readers, so we now present the der¬ 
ivations of the formulae given in (5) and (9) above. The deriva¬ 
tions given here are directly adapted from those of Merriman 
referred to above, but have been translated into the customary 
statistical notation whenever possible. 

First, let us consider the derivation of the value in (9). As
stated in (8), the most customary form of Sigma is


( 8 ) 



N 


where 


( 11 ) 

Each value has a certain error, however, due to the 
fact that the value of the mean is merely the most probable value, 
not the true value. So for each value there is a small un¬ 
known error , so that if we take to be the true value 
of a deviation we have 


( 12 ) 

and, squaring and summating, disregarding the terms involving 
second power delta terms as small in comparison with the first 
power terms, we have 

(13) Z. ZZ <5^, 

Now, by the laws of probability, we know that the probabil¬ 
ity of the occurrence of an error , whose measure of pre¬ 
cision is “h,” is 

(14) Jl^hdxrr ^ 

multiplying both sides of this equation by 5?“* and summating 
between the limits plus and minus infinity, we have 




SL 2Z 

and since ZJJT£ is the same as 7 ^ — , since in our work we 
assiime the weight of each value 2 , for each of the N observa¬ 
tions, to be 4-i y we have 
N 


( 16 ) 


1 

N 


or 

(16a) 


Likewise, if we let 


(17) 


<5Xg 


the probability of the system of errors, , is 

(18) VL'duTT'^ 

and the mean of all of the possible values of is 




(19) 


- f 

irst. J 


u & da^^fjZ 


and this must be taken as the best attainable value of u. . But it 
was shown that the quantity is equal to (16). 

Hence 

( 20 ) a, 

from which 






( 9 ) 





which was to be proved. 

To derive (5) we proceed in much the same manner. After
our normal equations have been solved for the most probable
values of $\beta_{01}, \beta_{02}, \ldots, \beta_{0m}$ for our set of data,
we know that these are not the true values, but that they err by
small unknown corrections $\delta\beta_{01}, \delta\beta_{02}, \ldots, \delta\beta_{0m}$,
the corresponding true values for the universe being

$(\beta_{01} + \delta\beta_{01}),\ (\beta_{02} + \delta\beta_{02}),\ \ldots,\ (\beta_{0m} + \delta\beta_{0m})$.

Now, if we substitute the most probable values of the Betas
in our original observation equations, they will not reduce to zero,
but will leave small residuals $v$, thus


' ^rnf^ox * ^ 

-^Zg ^ ^03 ^3g ' 



• • 1^/3 ^ V 

' Om 7r7/v ^ 


while if the corresponding true values be substituted, we obtain 


(A>1 \ 







Subtracting each of the former equations from the latter, we obtain 


Now the principle of least squares provides that H shall 
be made a minimum to give the most probable values of ^ , 

, . . . , and by the solution of the normal 

equations by the Doolittle method its minimum value is found to 
be H, . From the residual equations we may find the re¬ 
lationship existing between the values H and ZJ . Thus, 
if we square each equation immediately above and then summate 
we have (if we neglect squares and products of the delta values 
as small in comparison with the first powers): 


( 21 ) 

which we may write as 

( 22 ) L .^ Z 

Now, by analogous reasoning to that in steps (14), (15), and 







(16), we may set 
(23) Z 

Further, if there be but one independent variable, there will 
be but one and its value ly the same process 

used in steps (18) and (19) can be shown to be 


(24) 



and since that is true whichever unknown quantity be considered, 
the values of each value must be \ and as there 

are of these values the above equation (22) becomes 


z:v^ 


n 

* 17 ^ 


N 


from which 
(25) 




N- M 


ZZ 


Therefore, from the constant relationship which exists be¬ 
tween the value h and the Probable Error, we have 


(26) 


PE. =0.6745 


J N- 


M 


and therefore, by the relationship existing between the probable
error and the standard deviation we have at once


(5)



N~M 


which was to be proved. 

The next step was to test out the formula empirically.
This was done by using Larson's material, with the results
indicated in the tables below, and in the graphs which show the same
set of facts, but which make the results much more apparent.

An inspection of the tables and graphs will show at once
that the new formula predicts what will actually happen much
more accurately than the Smith formula did. In graph 1, for
example, the agreement is so good that the results appear almost
to have been a regression line fit to the particular set of data.

It was to have been expected that if the formula actually
predicted the most probable values of the correlations obtaining
in the universe, the errors incurred by the use of the formula
would be normally distributed around zero as a mean value. Graph
3 presents a comparison of the error curves obtained by use of
the Smith and the Wherry prediction formulae, together with an
approximation to the normal curve. As a further and more
scientific check the criteria for a normal curve as set forth by Rietz¹
were applied to the data. His criteria are

^ 1 - 0 ,where 
z ^ 

The results for the two formulae are given below.


(Results based on an expectancy of zero)

                      Smith Formula    Wherry Formula
μ₁ (mean error)          .00138           .00038
β₁ (skewness)            .223             .025
β₂ (excess)              3.004            3.703

¹Rietz, H. L. Mathematical Statistics, Carus Mathematical Monograph
No. 3, Mathematical Association of America, Chicago 1927, pp. 58-59.











It is apparent therefore that the Wherry formula gave much 
better results for both the first criterion (mean error) and the 
second criterion (skewness), but that the excess was greater for 
the Wherry formula than for the Smith formula. However, one 
cannot quarrel too much with getting errors actually smaller than 
would be expected by assuming normality. Even this superiority 
is seen to be fictitious if the distributions are measured from 
their own means rather than from an expected mean of zero. 
When this is done, which is the manner in which the criteria are
customarily used, we have


(Results based on means of distributions)

                      Smith Formula    Wherry Formula
μ₁ (mean error)          .000             .000
β₁ (skewness)            1.712            .025
β₂ (excess)              5.524            3.753


Thus, we find that the Smith distribution has, in reality, even
a greater excess than does the Wherry formula, but has it at a 
point farther removed from the desired value. 


SUMMARY AND CONCLUSIONS 

1. Larson has shown that the theoretically expected shrink¬ 
age is an empirical fact. 

2. Larson has shown that the Smith formula, when tested 
empirically, consistently over-estimates this shrinkage as deter¬ 
mined empirically. 

3. It has been demonstrated that the new Wherry formula, 





both by a least squares criterion and by actual application, is more 
nearly true than the corresponding Smith formula. 

4. The correct formula for the shrunken coefficient of
multiple correlation is

$\bar{R}^2 = \dfrac{(N-1)R^2 - (M-1)}{N-M}$

where

$\bar{R}$ = the estimated correlation obtaining in the universe
$R$ = the observed coefficient of multiple correlation
$M$ = the number of independent variables
and
$N$ = the number of observations (statistical population).




GRAPH 1

Shrinkage as Obtained by Use of the Formulae and Also as
Obtained Experimentally

[Graph: Shrinkage in R plotted against Number of Variables.]


GRAPH 2

Shrinkage as Obtained by Use of the Formulae and Also as
Obtained Experimentally

(Data from Table II)

[Graph: Shrinkage in R plotted against Number of Variables.]


GRAPH 3

Ogive Showing the Distribution of Error in Predicting Shrinkage

[Graph: Cumulated Frequency plotted against Magnitude of Error.]







Showing the Actual Shrinkage in Q Found When the Prediction Equation Found on One Group of Sub- 


*The article by Larson reported the values for the Smith formula
erroneously, due to a misconception of the meaning of M. Those in the
present tables are the correct values.

























TABLE II 













































TABLE III


Showing the Mean Error Attained by the Use of the Smith and
Wherry Shrinkage Formulae.

Formula     Table I     Table II    Tables I and II
Smith       .00097      .00180      .00138
Wherry      .00018      .00057      .00038



















THE USE OF THE RELATIVE RESIDUAL IN
THE APPLICATION OF THE METHOD
OF LEAST SQUARES


By

Walter A. Hendricks
Junior Biologist, Bureau of Animal Industry,
U. S. Department of Agriculture.

The method of least squares offers a precise method of fitting
a curve describing the relation between two or more related,
measurable variables, but certain criteria must be fulfilled to justify its
application. First, the type of equation selected for fitting must
be the true mathematical expression of the law governing the
relationship of the variables. Secondly, all errors of measurement,
made in obtaining the observed values of the variables when the
data were collected, must be distributed according to the
well-known laws of probability.¹

This paper is concerned with the latter of these two criteria. 
The fundamental theory upon which the method of least squares 
is based can be found in any text-book on the subject and need 
not be elaborated upon here. However, it may be well to point 
out a very pertinent, if somewhat elementary, aspect of the theory 
which facilitates the ready visualization of the fundamental con¬ 
cepts involved. 


¹Steinmetz, C. P. Engineering Mathematics. McGraw-Hill Book Co.,
New York (1917).



W. A. HENDRICKS 


459 


The application of the method of least squares to curve fit¬ 
ting, as ordinarily described in works on the subject, is perfectly 
analogous to the calculation of the arithmetic mean of a number 
of measurements made upon a single, constant quantity. This 
may be easily demonstrated as follows: 

Let $Y = f(X)$ describe the relation existing between an
independent variable, $X$, and a dependent variable, $Y$. If it is
desired to find the most probable value of the dependent variable
when $X$ has some definite value, $X_1$, the most direct method of
procedure would be to make a number of measurements of $Y$
at this value of $X$ and calculate their arithmetic mean, provided,
of course, that the errors of measurement were distributed
according to the laws of probability in a normal frequency distribution.
According to the elementary theory of statistics, the most
probable value of the dependent variable, $f(X_1)$, would be such
that the sum of the squares of the deviations of the actual
measurements from this value would be a minimum.
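
The minimum property of the arithmetic mean mentioned above can be illustrated with a few lines of code. This is a minimal sketch; the five measurements are hypothetical values invented for the illustration, and the function name is not from the original.

```python
def sum_of_squares(values, centre):
    """Sum of squared deviations of the measurements from a trial value."""
    return sum((v - centre) ** 2 for v in values)


# Hypothetical repeated measurements of Y at one fixed value of X.
measurements = [9.8, 10.1, 10.3, 9.9, 10.4]
mean = sum(measurements) / len(measurements)

# Any trial value other than the mean gives a larger sum of squares.
for shift in (-0.5, -0.01, 0.01, 0.5):
    print(shift, sum_of_squares(measurements, mean + shift) - sum_of_squares(measurements, mean))
```

Each printed difference equals, up to rounding, the number of measurements times the square of the shift, which is why the mean is the minimizing value.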

If $X$ is conceived to be varying in value so rapidly that it
is impossible to make more than one measurement of $Y$ at any
value of $X$, this direct method can not be employed. However,
the most probable value of $f(X_1)$ can still be determined.
Let $Y_1, Y_2, Y_3, \ldots, Y_n$ each represent a measured value
of $Y$ at values $X_1, X_2, X_3, \ldots, X_n$, respectively, of
the independent variable. Since the errors of measurement are
assumed to be distributed according to the law of chance, an
error of a given magnitude is equally likely to occur at any value
of $X$. In other words, exactly the same errors would be made
in obtaining one measurement of each of the quantities $Y_1$,
$Y_2, \ldots, Y_n$ as if $f(X_1)$ were measured $n$ times. These
errors may, therefore, be considered as having been made in
measuring a single, constant quantity. Therefore, if $f(X)$ denotes
the most probable value of $Y$ at any value of $X$ and $Y$
denotes the corresponding observed value, the most probable values
of the dependent variable which can be calculated from any set of



460 


USE OF THE RELATIVE RESIDUAL 


data are such that the sum of the squares of the differences,
$f(X) - Y$, is a minimum.

It is important to bear in mind that this conception of the 
distribution of errors of measurement is justified only when an 
error of a given magnitude is equally likely to occur at any value 
of X . In actual practice it often happens that this ideal condition 
is not realized. The magnitude of the errors of measurement is 
often influenced by the magnitude of the quantity which is being 
measured. In obtaining the live weights of animals at different 
ages, for example, it is common practice to use a less delicate 
balance in making the weighings as the animals become larger, and 
the magnitude of the errors of measurement increases as the 
sensitivity of the balance decreases. Other factors which tend to 
increase the magnitude of the errors may also be in operation.
The error, or rather the unreliability, of the weight of a 1,000 
pound steer would be greater than that of a 100 pound calf, even 
though an equally sensitive balance were used in making both 
weighings, because of a greater content of material in the digestive 
tract and excretory organs and the increased effect of the move¬ 
ments of the animal. 

It is highly probable that in many fields of investigation such 
disturbing influences are encountered more frequently than the 
ideal conditions which justify the application of the method of 
least squares as ordinarily described. 

Pearl and Reed recognized the need for modifying the ap¬ 
plication of the method of least squares to compensate for changes 
in the probability of the occurrence of an error of any given mag¬ 
nitude and suggested, as stated by Pearl,^ that it would be more 
logical in many instances to employ residuals of the type — 

The use of such residuals was based on the assumption that if 
the errors of measurement were expressed as percentages of the 

¹Pearl, Raymond. Studies in Human Biology. Williams & Wilkins,
Baltimore (1924).





magnitude of the quantities measured, the percentage errors would 
be distributed at random according to the law of probability. In 
many practical problems this assumption appears to be justifiable. 

The study herein reported was made to determine the extent 
of the error made when the method of least squares as ordinarily 
described is applied to data in which the percentage, rather than 
the absolute errors of measurement are distributed according to 
the law of chance. 

The writer desired a hypothetical set of errors of measurement
which, when expressed as percentages of the quantities measured,
would come as near as possible to forming a normal
frequency distribution.


TABLE I

Ideal Frequency Distribution of 41 Throws of 12 Dice in Which
a Throw of 4, 5, or 6 Points Is Considered a Success.

SUCCESSES    FREQUENCY
    2            1
    3            2
    4            5
    5            8
    6            9
    7            8
    8            5
    9            2
   10            1
  Total         41



Mills¹ gives the results of fitting a normal frequency curve
to Weldon's distribution of 4096 throws of 12 dice, described by
Yule,² in which a throw of 4, 5, or 6 points was considered a
success. If each frequency, calculated from the fitted curve, is
divided by 100 and the results rounded off to whole numbers,
the frequency distribution given in Table I is obtained.

If hypothetical errors of measurement are substituted for 


TABLE II

Ideal Frequency Distribution of 41 Hypothetical Percentage
Errors of Measurement.

        ERROR
(Per cent of quantity     FREQUENCY
      measured)

         + 8                  1
         + 6                  2
         + 4                  5
         + 2                  8
           0                  9
         - 2                  8
         - 4                  5
         - 6                  2
         - 8                  1

        Total                41


¹Mills, F. C. Statistical Methods Applied to Economics and Business.
Henry Holt & Co., New York (1924).

²Yule, G. Udny. Introduction to the Theory of Statistics. Charles Griffin
& Co., Ltd., London (1927).












successes in this frequency table, the resulting distribution may be
considered to represent a distribution of random errors of
measurement which might be made in obtaining a series of 41
measurements of a variable. The most probable error should obviously
be zero. If the total range in magnitude of the errors is assumed
to be from +8 per cent to -8 per cent and the precision of
measurement is such that each error differs from the next larger or
smaller error by 2 per cent, the distribution of these hypothetical
errors of measurement should be as given in Table II.

From the simple equation, $Y = 100X^2$, 41 values of $Y$ were
calculated, using values of $X$ from 1 to 41, inclusive. Each
calculated value of $Y$ was then changed by algebraically
subtracting the hypothetical errors of measurement given in Table II.
All the percentage errors of each magnitude were arbitrarily
distributed as uniformly as possible throughout the data. These altered
values of $Y$ will hereafter be termed the "observed" values and
the original values, from which they were calculated, the "true"
values. The observed values of $Y$, together with the true values
and the assumed errors of measurement from which they were
calculated, are given in Table III.
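The construction just described can be sketched in a few lines of
Python. This is an illustration only, not the writer's own working:
the names `ERROR_FREQUENCIES`, `true_value`, and `observed_value` are
hypothetical, and the three (X, error) pairs used as a check are ones
whose arithmetic can be read from Table III.

```python
# Percentage errors and their frequencies, as listed in Table II.
ERROR_FREQUENCIES = {8: 1, 6: 2, 4: 5, 2: 8, 0: 9, -2: 8, -4: 5, -6: 2, -8: 1}


def true_value(x):
    """True value of Y from the equation Y = 100 X^2."""
    return 100 * x * x


def observed_value(x, error_percent):
    """Observed Y: the true value less the given percentage of itself."""
    true_y = true_value(x)
    # true_y is a multiple of 100, so the integer division is exact.
    return true_y - true_y * error_percent // 100


# The frequencies account for all 41 measurements and balance about zero.
n_errors = sum(ERROR_FREQUENCIES.values())
net_error = sum(e * f for e, f in ERROR_FREQUENCIES.items())

# Three (X, percentage error) pairs taken from Table III.
examples = [(22, 8), (24, -4), (38, -6)]
rows = [(x, true_value(x), e, observed_value(x, e)) for x, e in examples]
for row in rows:
    print(row)
```

A positive percentage error lowers the observed value, since the
errors are subtracted from the true values.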

In order to be certain that the errors were actually distributed
in such a manner that the probability of the occurrence of a
percentage error of any given magnitude was the same at all values
of $X$, the writer employed Pearson's method of square
contingency as described by Yule.¹ A 16-cell contingency table was
constructed in which the percentage errors were classified
according to the values of $X$ at which they occurred. The chi-square
test for contingency was applied to this table.

¹Loc. cit.


TABLE III

Calculation of the Observed Values of Y from the True Values.

                          ERROR
  X     100X^2    Per cent    Actual units       Y

 22      48400      + 8         + 3872         44528
 23      52900        0              0         52900
 24      57600      - 4         - 2304         59904
 25      62500      - 2         - 1250         63750
 26      67600        0              0         67600
 27      72900      - 2         - 1458         74358
 28      78400      + 6         + 4704         73696
 29      84100      + 2         + 1682         82418
 30      90000      + 4         + 3600         86400
 31      96100      - 4         - 3844         99944
 32     102400      + 2         + 2048        100352
 33     108900        0              0        108900
 34     115600      + 2         + 2312        113288
 35     122500      - 2         - 2450        124950
 36     129600        0              0        129600
 37     136900      + 4         + 5476        131424
 38     144400      - 6         - 8664        153064
 39     152100      - 2         - 3042        155142
 40     160000        0              0        160000
 41     168100      - 4         - 6724        174824

(The rows for X = 1 to 21 are not legible in the source.)


Table IV shows the actual distribution of the percentage
errors, together with the corresponding theoretical frequencies.
Since there are 4 rows and 4 columns of cells in the table, the
number of algebraically independent differences between theoretical
and observed frequencies is (4 - 1)(4 - 1) + 1, or 10. The
value of $\chi^2$, calculated from the data in Table IV, is 1.3171.
The corresponding value of $P$, which is the probability that as bad,
or worse, an agreement between observed and theoretical
frequencies could occur from the fluctuations of random sampling
is, according to Pearson's Tables,¹ 0.996911 or almost certainty.
The percentage errors were, therefore, distributed in such a
manner as to be uncorrelated with the values of $X$ at which they
were used.
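The contingency calculation can be checked with a short Python
sketch. It is offered as an illustration under stated assumptions:
the observed frequencies are those of Table IV, the function name
`chi_square_contingency` is hypothetical, and only the statistic is
computed, not the tabled value of $P$. If, as seems to be the case,
the $n'$ of Pearson's tables is one more than the number of degrees
of freedom as now usually defined, the 10 of the text corresponds to
9 degrees of freedom.

```python
# Observed frequencies from Table IV. Rows are the ranges of X
# (1-10, 11-20, 21-30, 31-41); columns are the error magnitudes.
OBSERVED = [
    [2, 4, 2, 2],
    [2, 4, 3, 1],
    [2, 4, 2, 2],
    [3, 4, 3, 1],
]


def chi_square_contingency(table):
    """Return (chi-square, theoretical frequencies) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Theoretical frequency of a cell: row total * column total / N.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    return chi2, expected


chi2, expected = chi_square_contingency(OBSERVED)
print(round(chi2, 4))            # close to the 1.3171 given in the text
print(round(expected[0][0], 4))  # theoretical frequency of the first cell
```

Small differences in the last decimal place are to be expected,
because the text works from theoretical frequencies rounded to four
places.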

The equation, $Y = AX^2$, was fitted to the hypothetical set
of data in Table III by the method of least squares as ordinarily
described. If $AX^2$ represents a calculated value of the dependent
variable and $Y$ represents the corresponding observed value,
the difference between these two values is $AX^2 - Y$ and the
square of the difference is $A^2X^4 - 2AX^2Y + Y^2$. The sum
of the squares of all the differences is
$A^2\sum X^4 - 2A\sum X^2Y + \sum Y^2$.
The value of this expression will be a minimum when its derivative
with respect to $A$ is equal to zero. Differentiating and
equating to zero yields the following equations for the
determination of $A$:

(1)    $2A\sum X^4 - 2\sum X^2Y = 0$

(2)    $A = \frac{\sum X^2Y}{\sum X^4}$

The value of $A$ calculated from the data in Table III by
means of equation (2) is 100.6250.
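Equation (2) can be written as a weighted mean of the ratios
$Y/X^2$ with weights $X^4$, which is why errors at large $X$
dominate the result. The following Python sketch illustrates this;
it is not a reproduction of the 100.6250 above, which depends on the
full arrangement of errors in Table III. The function name
`fit_simple_residuals` and the single-error test data are assumptions
made for the illustration.

```python
def fit_simple_residuals(xs, ys):
    """Equation (2): A = sum(X^2 * Y) / sum(X^4)."""
    numerator = sum(x ** 2 * y for x, y in zip(xs, ys))
    denominator = sum(x ** 4 for x in xs)
    return numerator / denominator


# Error-free data from Y = 100 X^2 return A = 100.
xs = list(range(1, 42))
ys_true = [100 * x ** 2 for x in xs]
a_true = fit_simple_residuals(xs, ys_true)

# Hypothetical data: one 8 per cent error placed at X = 1, then at X = 41.
ys_low = [y - (8 * y // 100 if x == 1 else 0) for x, y in zip(xs, ys_true)]
ys_high = [y - (8 * y // 100 if x == 41 else 0) for x, y in zip(xs, ys_true)]
shift_low = abs(fit_simple_residuals(xs, ys_low) - 100)
shift_high = abs(fit_simple_residuals(xs, ys_high) - 100)
print(a_true, shift_low, shift_high)
```

The same percentage error moves $A$ far more when it falls at
$X = 41$ than when it falls at $X = 1$.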

¹Pearson, Karl. Tables for Statisticians and Biometricians. Cambridge
University Press, London (1924).


TABLE IV

Chi-square test for contingency applied to the distribution of the
percentage errors of measurement. The theoretical frequencies
for each compartment are given in parentheses.

                     Magnitude of Error (Per cent)
Value of
   X        0.0 to      ±2.0 to     ±4.0 to     ±6.0 and    Total
             ±1.9        ±3.9        ±5.9         over

 1 to 10       2           4           2           2         10
           (2.1951)    (3.9024)    (2.4390)    (1.4634)

11 to 20       2           4           3           1         10
           (2.1951)    (3.9024)    (2.4390)    (1.4634)

21 to 30       2           4           2           2         10
           (2.1951)    (3.9024)    (2.4390)    (1.4634)

31 to 41       3           4           3           1         11
           (2.4146)    (4.2926)    (2.6829)    (1.6097)

 Total         9          16          10           6         41

$\chi^2$ = 1.3171
$n'$ = 10
$P$ = 0.996911


If residuals of the type suggested by Pearl and Reed are
employed, $A$ is calculated as follows. Let $AX^2$ represent
a calculated value of the dependent variable, as before, and let $Y$
represent the corresponding observed value. Then the difference
between the two values, expressed as a fraction of the observed
value, is $\frac{AX^2 - Y}{Y}$ or $\frac{AX^2}{Y} - 1$. The square
of this relative deviation is
$\frac{A^2X^4}{Y^2} - \frac{2AX^2}{Y} + 1$ and the sum of the
squares of the 41 relative deviations is
$A^2\sum\frac{X^4}{Y^2} - 2A\sum\frac{X^2}{Y} + 41$. This
expression will likewise have its minimum value when its derivative with
respect to $A$ is equal to zero. Differentiating and equating to
zero, as before, leads to the following equations for the
determination of $A$:

(3)    $2A\sum\frac{X^4}{Y^2} - 2\sum\frac{X^2}{Y} = 0$

(4)    $A = \frac{\sum (X^2/Y)}{\sum (X^4/Y^2)}$

Applying equation (4) to the given set of data gives a value
of 99.7573 for $A$. This value of $A$ is closer to the true value,
100, than the value which was calculated by means of equation
(2), but the improvement was not as great as might be expected.

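It occurred to the writer that if the deviations of the
calculated, from the observed, values of the dependent variable were
expressed as fractions of the calculated values, a more accurate
value of $A$ could be obtained.

The relative deviation expressed in this manner is
$\frac{AX^2 - Y}{AX^2}$ or $1 - \frac{Y}{AX^2}$. The square of this
deviation is $1 - \frac{2Y}{AX^2} + \frac{Y^2}{A^2X^4}$
and the sum of the squares of the 41 relative deviations is
$41 - \frac{2}{A}\sum\frac{Y}{X^2} + \frac{1}{A^2}\sum\frac{Y^2}{X^4}$.
Differentiating this expression with
respect to $A$ and equating to zero yields the following equations
for the determination of $A$:

(5)    $\frac{2}{A^2}\sum\frac{Y}{X^2} - \frac{2}{A^3}\sum\frac{Y^2}{X^4} = 0$

(6)    $A = \frac{\sum (Y^2/X^4)}{\sum (Y/X^2)}$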





The value of $A$, calculated from the data by means of
equation (6), is 100.1210, which is nearer to the true value than either
of the values calculated by the two preceding methods. However,
it is evident that equation (6) failed to give results as precise as
one would expect, in view of the method by which the observed
values of $Y$ were obtained.

The reason for this discrepancy can be made most apparent by 
returning to the analogy existing between the application of the 
method of least squares to curve fitting and the calculation of the 
arithmetic mean of a number of measurements of a single, con¬ 
stant quantity. 

Let $m_1, m_2, \ldots, m_n$ represent measured values
of the same constant quantity and let their arithmetic mean be
represented by $M$. If each measurement is divided by the
arithmetic mean of all the measurements, the resulting distribution of
these relative values will be normal if the original measurements
were distributed normally. The arithmetic mean of these relative
values will obviously be unity.

Let $\frac{m_1}{M}, \frac{m_2}{M}, \ldots, \frac{m_n}{M}$ represent
the relative values of the measurements. The arithmetic mean of
these values is unity. Therefore, the deviation of any relative
value, $\frac{m}{M}$, from the mean is $1 - \frac{m}{M}$.

Let it be assumed that the value of the arithmetic mean of
the original measurements, $M$, is unknown and is represented
by $Z$. Then any measurement, $m$, expressed as a fraction of
$Z$, is $\frac{m}{Z}$. According to the discussion in the two preceding
paragraphs, it might appear that $Z$ must have such a value that
the sum of the squares of the deviations, $1 - \frac{m}{Z}$, is a minimum.
However, this is not the case. It may be demonstrated that the
value of the expression $\sum\left(1 - \frac{m}{Z}\right)^2$ is a
minimum when $Z$ has some other value than the arithmetic mean of
the original measurements. The sum of the squares of the residuals
may be written, $n - \frac{2}{Z}\sum m + \frac{1}{Z^2}\sum m^2$.
Differentiating this expression with respect to $Z$ and equating to
zero yields the following equations for the determination of $Z$:

(7)    $\frac{2}{Z^2}\sum m - \frac{2}{Z^3}\sum m^2 = 0$

(8)    $Z = \frac{\sum m^2}{\sum m}$

The value of $Z$, calculated by means of equation (8), is
obviously not the arithmetic mean of the original measurements.
The fallacy in the deduction of this equation is readily apparent.
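How far the $Z$ of equation (8) departs from the arithmetic mean can
be seen from the following identity, added here as a worked step. The
symbol $s^2$, standing for the mean squared deviation of the
measurements about $M$, is introduced only for this illustration and
is not the writer's notation.

```latex
\begin{align*}
\sum m^2 &= \sum (m - M)^2 + nM^2 = n\,(s^2 + M^2),
  \qquad s^2 = \frac{1}{n}\sum (m - M)^2, \\
Z &= \frac{\sum m^2}{\sum m}
   = \frac{n\,(s^2 + M^2)}{nM}
   = M + \frac{s^2}{M}.
\end{align*}
```

For a positive mean, $Z$ therefore exceeds $M$ whenever the
measurements are not all equal. Applied to the ratios $Y/X^2$ of the
present data, whose mean is 100 and whose squared fractional errors
from Table II sum to 0.0496, the same identity gives
$100(1 + 0.0496/41) = 100.1210$, the value obtained from equation (6).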

Instead of using residuals of the type, $1 - \frac{m}{Z}$, and
differentiating the sum of the squares of the residuals with respect to
$Z$, one should use residuals of the type, $V - \frac{m}{Z}$, in which $V$
represents the arithmetic mean of the relative values, $\frac{m}{Z}$, of the
measurements. The sum of the squares of the residuals should
be differentiated with respect to $V$. The square of the residual,
$V - \frac{m}{Z}$, is $V^2 - \frac{2Vm}{Z} + \frac{m^2}{Z^2}$ and
the sum of the squares of all the residuals may be written
$nV^2 - \frac{2V}{Z}\sum m + \frac{1}{Z^2}\sum m^2$.
Differentiating with respect to $V$ and equating to zero yields
the following equations for the determination of $V$:

(9)    $2nV - \frac{2}{Z}\sum m = 0$

(10)   $V = \frac{\sum m}{nZ}$

Since the value of $V$ is known to be unity, equation (10)
may be written:

(11)   $nZ = \sum m$

from which $Z$ may be readily calculated as follows:

(12)   $Z = \frac{\sum m}{n}$

Equation (12) is obviously nothing more than the simple
formula for the calculation of the arithmetic mean of the
original measurements, which is sufficient evidence that the reasoning
involved in its deduction is sound.

It is now readily apparent why equation (6) did not yield
results which were consistent with the data in Table III. The
ratio, $\frac{Y}{AX^2}$, is analogous to the ratio, $\frac{m}{Z}$, and
residuals of the type, $V - \frac{Y}{AX^2}$, should have been used in
fitting the equation instead of residuals of the type,
$1 - \frac{Y}{AX^2}$.¹ The square of the residual,
$V - \frac{Y}{AX^2}$, is
$V^2 - \frac{2VY}{AX^2} + \frac{Y^2}{A^2X^4}$. The sum of the
squares of the 41 residuals is
$41V^2 - \frac{2V}{A}\sum\frac{Y}{X^2} + \frac{1}{A^2}\sum\frac{Y^2}{X^4}$.
Differentiating this expression with respect to $V$ and equating
to zero yields the following equations for the determination of $V$:

(13)   $82V - \frac{2}{A}\sum\frac{Y}{X^2} = 0$

(14)   $V = \frac{1}{41A}\sum\frac{Y}{X^2}$

Substituting the known value, unity, for $V$ in equation (14)
yields the following equations for the determination of $A$:

(15)   $41A = \sum\frac{Y}{X^2}$

(16)   $A = \frac{1}{41}\sum\frac{Y}{X^2}$


¹Residuals of the type, $\frac{AX^2}{Y} - 1$, are analogous to those
of the type, $\frac{Z}{m} - 1$, which also lead to incorrect results.





Applying equation (16) to the data in Table III gives $A$ a
value of 100.0000, which coincides exactly with the true value from
which the data were originally calculated. Equation (16) was,
therefore, the correct equation to use in interpreting the data given
in Table III. Although the use of residuals of the types,
$\frac{AX^2}{Y} - 1$ and $1 - \frac{Y}{AX^2}$, gave better
approximations to the true value of $A$ than the use of the simple
residuals, $AX^2 - Y$, neither of the two gave results which were
entirely in accord with the derivation of the data.
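Equations (4), (6), and (16) involve the data only through the
ratios $Y/X^2$, so the three values quoted above can be checked from
the error frequencies of Table II alone. The Python sketch below does
this under one stated assumption: the errors are assigned to
$X = 1, \ldots, 41$ in an arbitrary order, not the order of Table III,
which does not affect these three estimates. The function names are
hypothetical.

```python
# Percentage errors and their frequencies, from Table II.
ERROR_FREQUENCIES = {8: 1, 6: 2, 4: 5, 2: 8, 0: 9, -2: 8, -4: 5, -6: 2, -8: 1}

# Assumption: an arbitrary assignment of the 41 errors to X = 1..41.
errors = [e for e, count in ERROR_FREQUENCIES.items() for _ in range(count)]
xs = list(range(1, 42))
ys = [100 * x * x * (100 - e) / 100 for x, e in zip(xs, errors)]


def fit_equation_4(xs, ys):
    """Residuals AX^2/Y - 1: A = sum(X^2/Y) / sum(X^4/Y^2)."""
    numerator = sum(x ** 2 / y for x, y in zip(xs, ys))
    denominator = sum(x ** 4 / y ** 2 for x, y in zip(xs, ys))
    return numerator / denominator


def fit_equation_6(xs, ys):
    """Residuals 1 - Y/(AX^2): A = sum(Y^2/X^4) / sum(Y/X^2)."""
    numerator = sum(y ** 2 / x ** 4 for x, y in zip(xs, ys))
    denominator = sum(y / x ** 2 for x, y in zip(xs, ys))
    return numerator / denominator


def fit_equation_16(xs, ys):
    """Residuals V - Y/(AX^2) with V = 1: A = mean of Y/X^2."""
    return sum(y / x ** 2 for x, y in zip(xs, ys)) / len(xs)


a4 = fit_equation_4(xs, ys)
a6 = fit_equation_6(xs, ys)
a16 = fit_equation_16(xs, ys)
# Expected to agree with 99.7573, 100.1210, and 100.0000 in the text.
print(f"{a4:.4f} {a6:.4f} {a16:.4f}")
```

Equation (2) is not included because it weights each ratio by
$X^4$ and so depends on where each error falls.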

Yule¹ suggested that the geometric mean might often prove
useful in comparing the frequency distributions of different sets
of data, in which the dispersion of the individual measures about
their means was influenced by the magnitude of the means. It
appeared to the writer that the use of residuals of the type,
$\log AX^2 - \log Y$, might give a good approximation to the true
value of $A$ in fitting the given equation. It is evident that the
ratio, $\frac{AX^2}{Y}$, approaches unity as the residual,
$\log AX^2 - \log Y$, approaches zero.

This logarithmic residual may be written,
$\log A + 2\log X - \log Y$, and its square is
$(\log A)^2 + 4(\log X)^2 + (\log Y)^2 + 4(\log A)(\log X)
- 2(\log A)(\log Y) - 4(\log X)(\log Y)$.
The sum of the squares of the 41 residuals is
$41(\log A)^2 + 4\sum(\log X)^2 + \sum(\log Y)^2
+ 4(\log A)\sum(\log X) - 2(\log A)\sum(\log Y)
- 4\sum(\log X \cdot \log Y)$. Differentiating
this expression with respect to $\log A$ and equating to zero
yields the following equations for the determination of $A$:

(17)   $82(\log A) + 4\sum(\log X) - 2\sum(\log Y) = 0$

(18)   $\log A = \frac{\sum(\log Y) - 2\sum(\log X)}{41}$


¹Loc. cit.





The value of $\log A$, calculated from the given set of data by
means of equation (18), is 1.9997369, which gives $A$ a value
of 99.9394. This value of $A$ comes closer to the true value than
those calculated by means of residuals of the types,
$1 - \frac{Y}{AX^2}$ and $\frac{AX^2}{Y} - 1$. However, since the
use of the geometric mean is not rigorously justified when the
distribution of the measures about the arithmetic mean is
symmetrical, the use of logarithmic residuals in curve fitting can
not give precise results when the errors of measurement are
distributed as they were in the given set of data.
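Equation (18) makes $A$ the geometric mean of the ratios $Y/X^2$,
which the following Python sketch illustrates. As an assumption, the
errors of Table II are assigned to $X = 1, \ldots, 41$ in an arbitrary
order, which does not affect this estimate; common logarithms are
used, as the quoted value 1.9997369 implies. The function name
`fit_equation_18` is hypothetical.

```python
import math

# Percentage errors and their frequencies, from Table II.
ERROR_FREQUENCIES = {8: 1, 6: 2, 4: 5, 2: 8, 0: 9, -2: 8, -4: 5, -6: 2, -8: 1}

# Assumption: an arbitrary assignment of the 41 errors to X = 1..41.
errors = [e for e, count in ERROR_FREQUENCIES.items() for _ in range(count)]
xs = list(range(1, 42))
ys = [100 * x * x * (100 - e) / 100 for x, e in zip(xs, errors)]


def fit_equation_18(xs, ys):
    """log A = (sum(log Y) - 2 sum(log X)) / n, in common logarithms."""
    n = len(xs)
    sum_log_y = sum(math.log10(y) for y in ys)
    sum_log_x = sum(math.log10(x) for x in xs)
    log_a = (sum_log_y - 2 * sum_log_x) / n
    return log_a, 10 ** log_a


log_a, a18 = fit_equation_18(xs, ys)

# The same value, computed directly as a geometric mean of Y/X^2.
ratios = [y / x ** 2 for x, y in zip(xs, ys)]
geometric_mean = math.prod(ratios) ** (1 / len(ratios))
print(f"{log_a:.7f} {a18:.4f} {geometric_mean:.4f}")
```

Because the percentage errors are symmetrical about zero, the
geometric mean of the ratios falls slightly below their arithmetic
mean of 100.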

In any application of the method of least squares to a prac¬ 
tical problem, the procedure of the investigators should be gov¬ 
erned by the nature of the data to which it is being applied. In 
many instances the correct procedure can be deduced by a careful 
consideration and evaluation of the accuracy of the methods of 
measurement used in obtaining the data. Unfortunately, however, 
some sources of error are not always readily apparent at the time
the data are collected, and occasionally can not be quantitatively
estimated even though they are known to exist. If the nature of
the mathematical relationship existing between the dependent and 
independent variables is known, all that remains is to find the 
most probable values of the constants in the equation. 

A statistical study of the deviations of the observed values
of the dependent variable from the corresponding calculated
values, obtained after fitting the equation by several different
methods, may be of much help in deciding which method of fitting was
most consistent with the nature of the data. For example, Table
V gives the results of applying the chi-square test for contingency
to the distribution of the deviations of the observed values of $Y$
from the calculated values obtained when residuals of the type,
$AX^2 - Y$, were used in fitting the equation, $Y = AX^2$, to the
data in Table III. The value of $P$ is only 0.005061 and a mere
inspection of the table itself shows that large deviations tend to
occur more frequently, and small deviations less frequently, as
the values of $X$ increase. If the true nature of the values of
$Y$ in Table III were not known in advance, this distribution of
the deviations would be sufficient evidence that the method of
fitting the equation was not consistent with the accuracy of the
measurements made when the data were collected.


TABLE V

Chi-square test for contingency applied to the distribution of the
deviations of the type, $AX^2 - Y$. The theoretical frequencies
for each compartment are given in parentheses.

                        Magnitude of Deviation
Value of
   X         0 to       ±2000 to    ±4000 to    ±6000 and   Total
            ±1999        ±3999       ±5999        over

 1 to 10      10           0           0           0         10
           (7.3171)    (1.2197)    (0.9756)    (0.4878)

11 to 20      10           0           0           0         10
           (7.3171)    (1.2197)    (0.9756)    (0.4878)

21 to 30       6           1           3           0         10
           (7.3171)    (1.2197)    (0.9756)    (0.4878)

31 to 41       4           4           1           2         11
           (8.0488)    (1.3415)    (1.0732)    (0.5366)

 Total        30           5           4           2         41

$\chi^2$ = 23.5989
$n'$ = 10
$P$ = 0.005061
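The way in which the large values of $X$ drive the result in Table V
can be seen by splitting chi-square into the contribution of each row.
The Python sketch below is an illustration under stated assumptions:
the observed frequencies are those of Table V, the variable names are
hypothetical, and the row-by-row split is added here and is not part
of the writer's calculation.

```python
# Observed frequencies from Table V. Rows are the ranges of X
# (1-10, 11-20, 21-30, 31-41); columns are deviation magnitudes.
OBSERVED = [
    [10, 0, 0, 0],
    [10, 0, 0, 0],
    [6, 1, 3, 0],
    [4, 4, 1, 2],
]

row_totals = [sum(row) for row in OBSERVED]
col_totals = [sum(col) for col in zip(*OBSERVED)]
grand_total = sum(row_totals)

# Contribution of each row of the table to chi-square.
contributions = []
for row, row_total in zip(OBSERVED, row_totals):
    expected = [row_total * c / grand_total for c in col_totals]
    contributions.append(
        sum((obs - exp) ** 2 / exp for obs, exp in zip(row, expected))
    )

chi2 = sum(contributions)
print([round(c, 3) for c in contributions])
print(round(chi2, 3))  # close to the 23.5989 given in Table V
```

The last row, covering $X$ from 31 to 41, supplies the largest share
of the total, in keeping with the remark that large deviations become
more frequent as $X$ increases.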

Tables VI, VII, and VIII give, respectively, the distributions
of the deviations of the types, $\frac{AX^2}{Y} - 1$,
$1 - \frac{Y}{AX^2}$, and $\log AX^2 - \log Y$, when the
corresponding residuals were used in fitting the equation.¹ The
value of $P$ is high in each case, indicating that, although the
use of residuals of these types did not give results which were
precisely accurate, nevertheless, they yielded values of $A$ which
were well within the limits of the probable error to be expected in
any practical investigation.

As a matter of fact, this is a rather fortunate circumstance,
since the only method of fitting the equation given above which
yielded exactly the correct value of $A$ cannot be applied to
fitting an equation containing more than one undetermined
constant. The applicability of residuals of the types,
$1 - \frac{Y}{f(X)}$ and $\log f(X) - \log Y$, is also somewhat
limited. However, any equation which can be fitted by the method of
least squares at all can still be fitted when residuals of the
type, $\frac{f(X)}{Y} - 1$, are employed.


¹The distribution of the deviations obtained when the equation was fitted
to the data by means of equation (16) is identical with the distribution of
the errors given in Table IV.


SUMMARY AND CONCLUSIONS

The method of least squares can be a more valuable tool in
statistical work when the fundamental theory upon which the
method is based is taken into consideration. The use of residuals
of the type, $f(X) - Y$, is probably justified in fewer practical
problems than the use of residuals of some other form. The type
of residual to be employed should be governed by the nature of
the data to which the method of least squares is being applied.

The use of relative residuals of the type suggested by Pearl
and Reed may be of much value in many instances but will not
give results which are precisely accurate, even though the
distribution of the percentage errors of measurement is strictly
normal. The results can be improved by expressing the deviations
of the observed from the calculated values of the dependent
variable as fractions of the calculated, rather than the observed, value.¹

The use of logarithmic residuals may give more accurate 
results than the use of residuals of the type suggested by Pearl 
and Reed, even though the distribution of the percentage errors 
of measurement is normal. 

The chi-square test for contingency may be of much help in 
selecting the type of residual most consistent with the errors of 
measurement made in obtaining the data when sufficient informa¬ 
tion regarding the accuracy of the measurements is not available. 


¹Residuals of this type have been used by Hendricks, Lee, and Titus at the
U. S. Animal Husbandry Experiment Farm, Beltsville, Maryland, in the
fitting of growth curves.

Hendricks, W. A., A. R. Lee, and H. W. Titus. Early growth of White
Leghorns. Poultry Sci. 8 (6): pp. 315-327 (1929).

Titus, H. W., and W. A. Hendricks. The Early Growth of Chickens as a
Function of Feed Consumption Rather Than of Time. Conference Papers
of the Fourth World's Poultry Congress, Section B (Nutrition and
Rearing): pp. 285-293 (1930).

The use of such residuals leads to results which appear to give a better
description of the data than when simple residuals of the type, $f(X) - Y$,
are employed.






TABLE VI

Chi-square test for contingency applied to the distribution of the
deviations of the type, $\frac{AX^2}{Y} - 1$. The theoretical
frequencies for each compartment are given in parentheses.

                        Magnitude of Deviation
Value of
   X        0.000 to    ±0.020 to   ±0.040 to   ±0.060 and  Total
             ±0.019      ±0.039      ±0.059       over

 1 to 10       4           3           2           1         10
           (4.1463)    (3.1707)    (1.7073)    (0.9756)

11 to 20       4           4           1           1         10
           (4.1463)    (3.1707)    (1.7073)    (0.9756)

21 to 30       4           3           1           2         10
           (4.1463)    (3.1707)    (1.7073)    (0.9756)

31 to 41       5           3           3           0         11
           (4.5610)    (3.4878)    (1.8780)    (1.0732)

 Total        17          13           7           4         41

$\chi^2$ = 3.8182
$n'$ = 10
$P$ = 0.921027





































TABLE VII

Chi-square test for contingency applied to the distribution of the
deviations of the type, $1 - \frac{Y}{AX^2}$. The theoretical
frequencies for each compartment are given in parentheses.

                        Magnitude of Deviation
Value of
   X        0.000 to    ±0.020 to   ±0.040 to   ±0.060 and  Total
             ±0.019      ±0.039      ±0.059       over

 1 to 10       3           4           2           1         10
           (3.9024)    (3.4146)    (1.7073)    (0.9756)

11 to 20       4           3           2           1         10
           (3.9024)    (3.4146)    (1.7073)    (0.9756)

21 to 30       4           3           1           2         10
           (3.9024)    (3.4146)    (1.7073)    (0.9756)

31 to 41       5           4           2           0         11
           (4.2927)    (3.7561)    (1.8780)    (1.0732)

 Total        16          14           7           4         41

$\chi^2$ = 3.0984
$n'$ = 10
$P$ = 0.959091



































TABLE VIII

Chi-square test for contingency applied to the distribution of the
deviations of the type, $\log AX^2 - \log Y$. The theoretical
frequencies for each compartment are given in parentheses.

                        Magnitude of Deviation
Value of
   X        0.000 to    ±0.010 to   ±0.020 to   ±0.030 and  Total
             ±0.009      ±0.019      ±0.029       over

 1 to 10       6           2           2           0         10
           (6.0976)    (2.4390)    (0.9756)    (0.4878)

11 to 20       6           3           0           1         10
           (6.0976)    (2.4390)    (0.9756)    (0.4878)

21 to 30       6           2           1           1         10
           (6.0976)    (2.4390)    (0.9756)    (0.4878)

31 to 41       7           3           1           0         11
           (6.7073)    (2.6829)    (1.0732)    (0.5366)

 Total        25          10           4           2         41

$\chi^2$ = 4.4989
$n'$ = 10
$P$ = 0.872945































EDITOR’S NOTE 


It is with great pleasure that the Annals brings to its readers
information concerning the Nordic Statistical Journal, edited by
Dr. Thor Andersson. This publication is of great merit, and the
work of its contributors compares very favorably with that found
in Biometrika and Metron. Americans will do well to study
carefully the contributions which Scandinavians are making to
statistical methodology.



Nordic

Statistical Journal

EDITED BY

THOR ANDERSSON


VOLUME 1

PAGE

index; . 6 

STATISTICS OR CHAOS . THE EDITOR 18 

STATISTICS AND LAROm MOVEMENT . A. THORBERO 33 

CORRELATION AND SCATTER IN STATISTICAL VARIABLES 

R. FRISOH 36 

INTERPOLATION IN STATISTICS . H. 0. NYB0LLB 108 

SOME REMARKS ON THE MEAN ERROR OF THE PERCENTAGE 

OF CORRELATION . <1. W. LINDEBERO 137 

SAMPLING . TOR JERNEMAN 142 

SOME REMARKS ON THE INCOME STATISTICS OP THE CEN¬ 
SUS IN SWEDEN IN 1920 . P. J. LINDERS 149 

THE AMPLITUDE OF INDUSTRIAL FLUCTUATIONS 

E. QJERMOE 165 

STATISTICS AND METEOROLOGY . A. ANGSTROM 228 

STATISTICS AND INSURANCE . THE EDITOR 236 

FBHR WILHELM WARGENTIN 1717—1783 N. V. E. NORDENMARK 241 

BILERT SUNDT 1817—1876 . N, RYQQ 258 

PIPERVIKEN AND RUSELOKBAKKEN. EILERT SUNDT 265 

T. N. TTTTELE 1838—1910 . 0. BURRAU 340 

W. JOHANNSEN 1867—1927 .THE EDITOR 849 

STATISTICS AND BIOLOGY.W. JOHANNSEN 851 

THE CENSUS OF ICELAND IN 17(S.T. THOR8TEINSSON 362 

THE CENSUS OF POPULATION IN NORWAY IN 1769 

H. PALMSTROM 871 

POPULATION REGISTRATION .0. AMNEUS 881 

PaPULATTOiN REXHISTRATTON IN DENMARK 

K. DALQAARD, OHR. BONDE 400 

POPULATION REGISTRATION IN FINLAND . M. KOYERO 436 

POPULATION REGISTRATION IN SWEDEN .... THE EDITOR 442 

AGRICULTURE IN THE NORDIC STATES.THE EDITOR 449 

FORESTS AND FORESTRY IN SUOMI (FINLAND) 

A. K. OAJANDER 529 

FORBSTS AND FORESTRY IN SWEDEN . F. AMINOFP 636 

FORESTS AND FORESTRY IN NORWAY.J. K. 8ANDM0 647 

irraONG IN THE NORDIC STATES.AAGE J. 0. JENSEN 664 

RaSOIDRaBS IN THE NOOEUHO STATES .. P. GEIJER 681 

WA.^ POWER IN THE NORDOTO STATES. 9. VELANDER 687 

SSimNG IN THE NORDIC STATES.A. 8KDIEN 601 

LI^ G COSTS IN THE NORDIC CAPITALS .... E. 8TORSTEEN 606 
THE NORDIC PEOPLES . THE EDITOR 621 





























ARTICLES IN NORDISK STATISTISK TIDSKRIFT.

Vol. 1

STATISTXKXSBEXNCI-, Brer till John Bums fr&n ... UTOiVABiir 

lEKTATISTlOl^iATION, ILetter to John Burns from . Tfia BdivOb 

DIB VABIATIONSBBBITB BBlM GAtTSSSOHEN BIIHLEIBGBISHTZ, I 

D. V. Bobtkibwxoi 

DAS OBRIQTZ DBR OBOSSEN ZAHLHN DND DEB arOOHASTISOH-STATISTISOHH 

STANDl^UNKT IN DBB MODBBNBN WISSBNSOHAFT . Al, A. TaoHinPBOW 

STATISWIOS AND PBKHISTOBIO SOIENOB . O. MoiraiMiri 

BIOLOGI OG STATISTIK . W. 

STATISTIK OG HISTOBIE . A. 0. J0HKS1» 

DEN ISDANDSKIQ STATISTIKS OMFANG 00 VILKAAB.. Thobstbin Thobstiinssok 
BBB’OLKNXNOSBTATISTIKBN I FINLAND, BEOBOANISATIONSPLANEB 

A. E. Tin>Bn 

NOBDHXNNEN I YXBLDEN ... Thob Akubbssok 

JOBDBBDKETS UTVEOKLINO I VISSA DELAB AV SKANB OOH DANMXBK 

Ebkbt HdUlB 

DIE ALLRUSSISOHEN LANDWIR/TSOHAFTSZXHLDNGEN VON 1916 UND 1917 

Stan. Eohii 

INTEBSKANDINAVISK HANDELSSTATISTIK 1912—1918 . Johs. DAiaow 

LEHBBOOHEB DEB fifTATISTIK . Al. A. Tsoktjpbow 

DIE VABXATIONSBBEITE BEIM GAUSSSCHEN FEHLEBGESET^ II 

L. r* Bobtkzxwioi 

STATISTISKA SAMEtTNDET I FINLAND . A. B. TtmaiB 

DEN NOBSKE OVERSJtilSKE UTVANDBINQ . B. Stobbthnit 

SVBNSKA JOBDBNS XGABE OOH BRtTKARB . Pattii Dahn 

LBHBBOOHEB DEB STATISTIK . AL. A. TsOHXJPBOW 

1ST DIE NORMALB STABILITXT BMPIBISOH NAOHWBISBAB Al. A. Tsohupbow 

ON THE EFFEOTIVITT OF WEATHER WARNINGS . A AngstbOu 

BIKRSTATtSTIKENS OBNTBALISERING I AMERIZAS FOBENTA STATER 

ET FOLEEREGISTER I DANMARK. JOHS. DALHonr 

DEB EINFLtTSS DES KBTEGES AUF DIE GEBTTRTBN . B. Obubib 

VoL 2 

WABSOHEINLIOHKBIT TOD STATISTISOHB FOBSOHUNG NAOH KEYNES 

L. V. Bobtjcuwioi 

AUFGABBN UND VOBAUSSETZUNGEN DEB KOBBBLATIONSMBSSUNG 

Al. a Tsohupbow 

BOOIADSTATISTIKENS OBNTRALISBRING OOH SOOIALBTYBBLSBNS INDBAG- 

KING 1 FINLAND ..... Thob Anobbssok 

FORHOLDET MBLLEM KJ0NNBNE I DEN STABNDB BBFOLKNING OG SBKSUAL- 

PROPOBSJONBN FOR DB F0TTE . Inovab Wbubevang 

DET SVBNSKA FODELSBOVBRSKOTTETS UTKOMSTMOJLIGHBTBR I EGBT LAND 

Fb. Sakdbbbg 

BIN BCROEBLIOHEB HAUSHALTTOGSAUFWAND . B. OaUBiB 

SVEBGES HANDELSSTATISTIK OOH DE STATISTIKSAKKUNNIGA Thob Andbbsson 

SAMFXRDSELNS PEBIODIOITET . Y. Ntlandwi 

BUSINESS STATISTICS .*. Al. A Tsohupbow 

FOLKBEGISTRERINGEN I NORGE ... G. Aitirtui 

FOLKOMBOSTNINQEN DEN 27 AUGUSTI 1922 ANGAENDE BUSDBYOESFOBBUD 

Otto GbOnlunb 

BIKSSTATISTIKENS OBNTBALISERING I FINLAND . A B. Tudbib 

BIKSSTAfTISTIKENS OENTRALISEEING I CANADA . Thob Anpbbssox 

ZWEOK UND STBUKTUB EINEB PREXSINDEXZAHL, I . L. r. Bobtkibwioi 

ABBEIDSBESPARENDE METODBB I STATISTIKKEN . Adolph Jbksbit 

OM MIDDELFEJLEN VED PABTIBLLE UNDEBS0GELSEB . Hans Ol. Ntb^llb 

FORSLAG TIL OIVILSTANDSBBGISTBBBING I NORGE .. G. Amrtus 

K0BENHAVNS FOLKEBEGISTER . Bbbtbl Dahlgaabd 

VAXTODLINGEN T SVERGE . Bbnbt H5otb 

INTEBSKANDINAVISK HANDELSSTATISTIK 1912—1922 . Johs. Dalhopp 

Vol. 8 

DET INTERNATIONALE STATISTISKE INSTITUTS M0DB I BRUXELLES I OK- 

TOBEE 1928 .. Adolph Jbksbh 

DANSKE STATISTIKBREfl FORBNINQ . H. H0ST 

STANDSREGISTRBRINGEN I UTLANDBT . G. Aionfui 

JBBNKONTORET OOH BBBGHANTBRINGSSTATISTIKBN 

8TUDIER X 8VENSK ALKOHOLSTATISTIK 1, 2 . HAns Gahn 

GRUNDBEGRIFFE TOD GRUNDPROBLEMB DEB KOBRBLATIONSTHBORIE 

Al. a Tsohupbow 

ZWEOK UND STBUKTUB FINER PREISINDEXZAHL, II. L. T. Bortxibwioi 

THE FOREST RESOURCES OF SWEDEN . Tob Jonson 

THE ORE RESOUROFiS AT THE KCIRUNAVAARA AND GELLIVARE HINES 

Walib. Pbtbbssoh 

SVEBGES POSTVASEN 1620—1924 . Y. Ntiandm 

STATISTICS OF INDUSTRIAL PRODUCTION . Adolph JzHsair 

EN BOK OM KOOPBRAflCTONEN ... Oubt Bothlto 

ZIBLB TOD WEGE DEB STOOHASTISOHBN GRUNDLEGTOG DEB STATIS- 

TISOHEN THBORIE . Al. A Tsohupbow 

ZWEOK UND STBUKTUB EINEB PREISINDEXZAHL. HI . L. V. Bobtkihwioi 

FBIL I DET SBFOLKNINGSSTATISTISKB MATERIALS . Hbnbik PalmstbOh 

UNDERSOKNING RGBANDB DEN ANIMALISKA PBODUKTIONENS STORLBK I 

SVERGE, I . Ebhsi? HOutbb 

THE PROSPECTS OF THE PAPER INDUSTRY . Hans Ansteim 

STATS- OG KOMMUNBBEGNSKABERNB T DB NORDISKB LANDS Ohbistiah OLav 

'Vol* ^ 

SVBNSKA FOBSXKRINGSFORENINGE N OO H STATISTIKEN ...... Thob Ahobbssoh 

WILHELM LEXIS UND SEINE BBDBUTTOG FtJ-R DIB VERSIOHEBTOGSWISSEN- 
SOHAFT ... W. Lobby 


















































TIL BBLTSNING AF FOBHOLLIllT MELLBM lAGTTAGUlLSBlSLJBRB 00- FORaiK* 

RINGSTBOBl . OaaXi Bxnuyk.9 

BVBNSKABNAS UTBBEDNXNG I NORBAMBRIKA . HsLOs N>LBOir 

SYSBSTATIBTIEl ... M. OBua$l*jU» 

BBANBFttBSAKRINaSSTATISTIKBN I SVERGE . HaimiK UvuaAt 

BXT GLEMT STATISTISK ARBEIB OM NORSK SJ5F0RSIKRING . K. hOUAmn 

OLASSIFIOATION Bt OCCUPATIONS AND INDUSTRIES AT THE GENERAL 

CENSUS . Raokvaiud JOKsnaBM 

DAS GESOHLEOHTSVERHALTNIS DBR GEBORENEN ALS GEOENSTAND DEB 

STATISTISCHBN PORSCHUNG .... Jkh. A. TeOllXirBOW 

DEN BORDIGA MARKENS FORDELNINO I FINLAND . A. K, OA;rAND»a 

THE DISTRIBUTION OP FERTILE SOIL IN FINLAND . A. K. Cajawdii 

BIKSSKOGSTAXBRINGEN I SVBRGB . TOE JOhSow 

PAPFERSMASSBINDUSTRIEN I NORRLAND . Hans AnsiSn 

NOTES ON FINANCIAL STATISTICS FOB THE NORTHERN COUNTRIES 

0XOa81*tAN OUUBN 

BUSINESS FORECASTING . Ai, A. TsoHXrSow 

THE REPRESENTATIVE METHOD IN STATISTICS . AooidPH Jenioiv 

THE REPRESENTATIVE METHOD IN PRACTICE . Adolph JKNiSair 

Vol. 6 

SANNOLIKHETSKALKYLBN I DEN VBTENSKAPLIGA LITTBRATURBN 

KOMITBEN TIL ANSTILLELSB AV UNDERSOKELSBB VEDRttRENSwB^^NO^Ss 

and FINANCIAL CONDITIONS IN NORWAY g: rJSS 

THE NORWEGIAN HARVEST STATISTICS AND THEIR RB-ABRANGBl&l^ 

DB SVBNSKA POLKSKOLBSBMINARIBBNA . 

15®PRODUCTS OP THE SWEDISH EXPORT TRADE .../‘"mwj anm!£I 
DE NORSKS LIVPORSIKRINGSSBLSKAPBBS KAPITALANBBINGE^B * Anstnin 

A. A. TSOOTPBOW t 

OTATISTIKPBOFESS-OBBBNA I STBbIe .. 

ON THE ANTHBOPOLOOT OF THE ISLAND OF BOBNHOuii’it; SaST7MmS??SS 

SKOLST^^r™ SJBTBBBBUKBT I STEBOB 00 HOBGB . a bSS^ 

SJOrpBSIKMNGBN'lNOBOBDNDEB HHikONiijNKT^ ^’LoStS! 

NAGBA PBAKfisKA BBSOLTAT FBH's-viBGBS'BHMskoOBTAXBBIW^ 
BOSTADSSTATISTIKEN I SVBBQB . 

STATISTISKA PROVNINGSANSTALTBR ... SE?!? AKDBBSSOlfr 

STATISTIKEN I ITALIBN OOH DESfl nwwrw aVVow®V;;« . Ain>»BSSd» 

REGISTER TILL BAND 1 -^ Ain>»wigoir 

INDEX TO VOL. 1—6 

Vol. 6 

POLKBREGISTER OG BEPOLKNINGSSTATTsrpTTT -r.. 

STATISTICS OP THE UNIVERSITY’’op'iiiwT. ANI>»»8aoir 

ST^ISTIKBN VID LINNES UNI^iP^TWT®^ . S®* AHi«BB»BOir 

AV INNTEKTSSTATISTTKKENS METOnJnS^'iViw”**;;*-'*;****.*'*** AKDaESBOir 

BEOT^amofsVlBKloMMVE°°I^DAN25^0^^^ 

w. lOHANNSBN f 0* STHawsTBUP 

tovandbing . A,o«.H 

5?55®' BARNBHTQIBNB‘l* OSLO* RY.;;. G.^^Kitra 

HATBBIALET FEAN BIKSSKOGSTAXEE?NGBN OOH DESS BBAEBBraiNG®’’'' 
NORDENS POLKRAKNINGAR 1920. 2 

Vol. 7 . 

DETEEMINATION op the DEGBEB op OBBDIBILm- OF NOBMAL SBBIBS 
STATISTIK OOH jl-ILITIK Gjbemoh 

SEKSUALPROPORTyow''R^T''-nww’*«w^;;;;;«. Thou Ahd»B8S0h 

D6DELIGHBTEN AT TOBEEKOLOB ®0®G ^^Gb’sSeN iwf 

§r^S9’SBRUKSTELLlNGEN I NORGE PALHSVEttM 


JoaaV ^STUJMTD 
Thob Ahobbbsoh 




















































Nordisk Statistisk Tidskrift was started in 1922. It is written
chiefly in the Nordic languages, but articles in English and
German are also published, and some of the Nordic articles carry
summaries in English or German. The opportunity is now taken to
realize the original scheme of publishing two editions, one in
the Nordic languages and the other in English. English has been
chosen for the non-Nordic edition also because the millions of
descendants of the Nordic peoples now living beyond the
boundaries of the Nordic states are mainly working in
English-speaking countries.


Nordic Statistical Journal has five departments: articles, reviews 
of books, minor communications, bibliographical lists of Nordic 
statistics, and recent periodicals and new books. In general, all 
departments will be represented in every number. 

Nordic Statistical Journal is published quarterly, the four
numbers making a volume of about 640 pages. The subscription
rate for a volume, post free, is 30 Swedish crowns.

Subscriptions may be sent to Nordic Statistical Journal,
Stockholm, Sweden.

The subscription rate through booksellers is 35 Swedish crowns. 

Editorial communications and all publications should be addressed
to Thor ANDERSSON, Dr. Ph., Stockholm, Sweden.


Aftonbl. tr., Sthlm 1929.



Nordic Statistical Journal

EDITED BY 

THOR ANDERSSON 

VOLUME 2 PARTS 1 & 2 


EDVARD PHRAGMÉN ... THE EDITOR
GUSTAV AMNÉUS 1865-1938 ... THE EDITOR
V. E. GAMBORG 1866-1929 ... THE EDITOR
ARVID THORBERG 1877-1920 ... THE EDITOR
LEXIS UND DORMOY ... L. V. BORTKIEWICZ
ON THE TECHNICS OF THE CALCULATION OF MOMENTS ... F. J. LINDERS
ON THE COMPOSITION OF TWO NORMAL FREQUENCY CURVES, 1 ... F. J. LINDERS
ABRUPT CHANGES IN LEVEL OF TREND ... EILIF GJERMOE
OFFICIAL STATISTICIANS' INSTRUCTION IN SWEDEN ... THE EDITOR
MECHANICAL AIDS TO STATISTICAL WORK ... VALTER LINDBERG
MATHEMATICS, STATISTICS, AND INTERNATIONALISM ... K.-G. HAGSTROM
THE FOREIGN LITERARY LANGUAGE IN THE SWEDISH OFFICIAL STATISTICS ... THE EDITOR
POPULATION REGISTRATION IN SWEDEN ... THE EDITOR
ON CHARACTERISTIC POINTS AND LINES OF THE GEOGRAPHICAL DISTRIBUTION OF A POPULATION ... F. J. LINDERS
ON HEAD MEASURES OF MALES IN SWEDEN ... F. J. LINDERS
STUDIES IN MATRIMONIAL FECUNDITY ... H. PALMSTROM
POPULATION INVESTIGATIONS REGARDING INVALIDITY- AND OLD AGE INSURANCE ... O. A. AKESSON
THE SWEDISH MORTALITY INVESTIGATIONS OF ASSURED MATERIAL ... H. PRAWITZ
SOME FEATURES OF THE DEVELOPMENT WITHIN THE TECHNICS OF DANISH LIFE INSURANCE, 1 ... CARL BURRAU
SOME FEATURES OF SWEDISH LIFE INSURANCE TECHNICS ... HARALD CRAMER
THE DEVELOPMENT OF NORWEGIAN LIFE INSURANCE TECHNICS ... FR. LANGE-NIELSEN
THE DEVELOPMENT OF LIFE INSURANCE TECHNICS IN FINLAND ... E. KEINANEN
STATISTICS AND AGRICULTURE IN SWEDEN ... THE EDITOR

REPRINT AND TRANSLATION FROM NORDISK FÖRSÄKRINGSTIDSKRIFT 1930.

Nordic Statistical Journal. Volume 1. Edited by Thor Andersson.
Stockholm 1929. Pp. 639. Reviewed by Dr. phil. Carl Burrau.

We, the inhabitants of the Nordic countries, are perhaps
somewhat inclined to take a certain inner pride in our
(as it seems to us) high civilization, and to attach still more
importance to ourselves in this respect in recent years,
when the "Ragnarök" of the Great War has devastated most
of the other civilized countries and handicapped them in their
competition with us. Let us hope that there are some good
grounds for our self-satisfied opinion! It is not difficult to
find facts indicating that we are right in this self-respect,
even if we go to the very summits of civilization; let us
think of the "Acta Mathematica", for instance. But if we
are right, it is all the more necessary for us to be on our
guard against the danger of stagnation, of the standstill
in which we begin to lull ourselves into the pleasant dream that
our position is unshakeable and that we may now rest on
our laurels. Therefore we must honour those who do not
allow us to go to rest, those who spur us on to do our
very best.

Thor Andersson is one of those whom we must honour for
such an influence. In the field of statistics he seeks to be our
scientific conscience. He swings his whip mercilessly over our
heads and drives to activity everybody who is able to
produce something, however small or great, within the field
of statistics. But he is not content with that! He is not
content with the achievement of having filled a long and
imposing row of volumes of the "Nordisk Statistisk Tidskrift"
with valuable essays and treatises written by Scandinavian
as well as by leading foreign authors; all the non-Scandinavian
countries are now to see and feel the warmth of the
light from the North. His journal is now to become an
international publication, but still with an indication of its
Nordic origin in its title. The first volume of the "Nordic
Statistical Journal", simultaneously forming the 8th volume of
the original journal, has appeared. And it is no trifling thing,
this volume of 639 pages in large octavo! It is great in its
composition, still more soaring in its purposes and aims for the
future, and promising, when we consider what "the man at
the wheel" has collected in these 639 pages by means of an
unusual perseverance, in unfailing love for the task and in
spite of many, too many, external adversities.

The leading thought of the work is the same as that which,
now nearly a decade ago, led Thor Andersson to found the Nordisk
Statistisk Tidskrift. It is a child of the Greeks' idea of
chaos and cosmos, or rather a consistent, modern continuation
of this idea. Statistics is the most important means of
bringing our existence over from chaos to cosmos. Statistics
acquaints us with the real circumstances, and knowledge,
real knowledge of things, will then show how to
bring things into their right places, so that the whole becomes
the ordered cosmos. But there is still much to do! We have
not yet been able to elevate statistics to the rank of an
observing natural science, which it should have in order to give
us the real science of things alluded to above. In 1922 the
thought was to be in the front rank of the work for this
purpose. We are obliged to Thor Andersson for the
strenuous work he has performed for his idea during the past
years, and now it will be done on a still broader basis, i.e.
for an international public, yet under Nordic leadership.

Let us study a little more closely how this new Volume I
seeks to perform its work in the service of this idea.

With a journalistic feeling, in the good sense, for topicality,
the volume appears as a sort of jubilee gift to Bortkiewicz
on his sixtieth birthday, and it is therefore opened by a
good full-page picture of this scientist, who has made such
valuable contributions to the original journal. To the reader,
the following essays seem to arrange themselves into three
groups which, just in order to give the groups a name,
could be designated as olden times, present times, and
future.

In the group belonging to the olden times, the editor seeks
to show how deeply rooted statistical science is in the
Nordic peoples by introducing a number of great men of
Nordic origin, each in his way a pioneer. These men are
represented partly in full-page pictures, partly in the text.
It should not surprise us that none of them is a "professional
statistician", for the profession is only now being created.
But they belong, each in his way, to the founders of this
branch. In the eighteenth century Wargentin, the astronomer,
founded population statistics, which is of fundamental
importance to demography. The essay on him is particularly
well written by Nordenmark.

The memory of the now nearly forgotten Eilert Sundt, who
was led by his activity as a clergyman to make scientific
investigations of the society in which he lived, and who
gradually became a social statistician of high rank, is
revived both by a reprint of his peculiar essay of 1858, "On
Piperviken and Ruseløkbakken (Investigations of the conditions
and morals of the working class in Christiania)", and
by a scientific appraisal of him ("Eilert Sundt's law") by
Rygg, who also gives an instructive account of how Sundt
was disfavoured by his contemporaries, naturally in the first
place by the politicians who had to do with the granting of
money for his investigations! Unfortunately the politicians
of the present times are no better; about that Thor Andersson
himself could write a sad chapter!

Then follows Thiele, whose principal scientific passion, "the
theory of observations", is simply the foundation of what is
now more generally called mathematical statistics, and finally
Johannsen, the investigator of heredity, who is commemorated
by a picture as well as by a reprint and a translation into
English of his contribution to the first volume of the original
journal: "Biology and statistics".

Several other essays, like those mentioned, also belong to
the olden times: thus H. Palmstrom's essay on the first census
in Norway in 1769 and that of Thorsteinsson on the census of
Iceland in 1703.

The essays which the reader naturally refers to the "present
times" were evidently commissioned by the editor in order to
show the non-Scandinavian world the conditions in the
Nordic countries in two respects, both extremely important
from a statistical point of view: population registration
and the industries; that is, firstly, how we gain our knowledge
about the number and the composition of the population,
and, secondly, how these people support themselves.

The editor could not have found any person more fit to
write the "general" article about population registration than
Amnéus, the director of the Oslo population register, whose
institution is up to the standard and has also served as a
model in many places, among others in Denmark. The condition
of these matters in the different countries is further
treated by the editor as far as Sweden is concerned, as to
Denmark by no fewer than two authors, Bonds and Dalpaard,
and with regard to Finland by Kovero. These are very
instructive essays which illustrate the importance of these
things in the right way. One learns how even the "torso"
of a population register found in Denmark (a not unjustified
epithet for the arrangement introduced there, which was
originally excellently planned, but which has been more than
half broken to pieces by small-minded and short-sighted
politicians, of course under the pretext of economy), thanks
to the fact that it is obligatory, gives excellent support in
many ways, among others for the 6-year censuses. One learns
how deplorably far behind matters are in Wargentin's native
country, where it was naturally necessary to carry out
registration in the large towns, but where the accomplishment
of the work is endangered by the rather antiquated and
burdensome obligatory collaboration with the clergy, and by
many other things besides. A survey like this, presented to
an international audience, is perhaps more than anything else
suited to advance the population register movement, which
will, however, one day triumph by its inner necessity.

Next the editor has intended to give a picture of the
industrial statistics of the Nordic states. He has himself
undertaken to treat the most important part: "the mother
industry", agriculture.

Aage I. C. Jensen treats fishery, STeoim shipping, Geijer the
ore resources of the Nordic countries, Velander the water
power of the Nordic countries, and forestry, finally, is treated
by Aminoff (Sweden), Sandmo (Norway), and Cajander (Finland).

To the "present times" we may also count Storsteen's article
on "The expenses of living in the Nordic capitals", very
interesting to us, the inhabitants of the Nordic capitals: a
subject full of pitfalls for a less experienced statistician, but
here treated with excellent finesse and with a clear foresight of
the difficulties.

To the same group belongs Thorberg's "Statistics and trade-
union movement", a rather short but extremely interesting
essay, not least on account of the author's position as
president of the national organisation of the Swedish trade
unions. Here fall the weighty words about the social-political
institution created by the League of Nations for international
labour organisation, that "the work of this organization
is rendered extremely difficult by the fact that it has hitherto
been almost impossible to arrive at any comparability between
the statistics of the different countries".

Finally there is Linders's "Some remarks on the income
statistics of the census in 1920", which, according to its title,
seems to belong to the present times, but which, according to
its contents, is in the first place a scientific arithmetical
example illustrating the applicability of Pareto's law,
concluding with some wishes regarding future official
investigations of income. This essay can therefore be said
to form the transition to the last group.


When I have permitted myself to designate the third group
of essays as "future", this is not, as a matter of fact, to
be understood as any paradoxical possibility of prophesying
the statistics of the future. I have only wished to emphasize
the editor's desire that his journal may also be one of the
laboratories where the instruments for the treatment of
future statistics are created. This side of the matter has
always had the editor's supreme interest; one need think only
of the contributions to the previous volumes by Bortkiewicz,
Tschuprow and others. We can call this side the theoretical
or perhaps the mathematical-statistical one.
It has been an urgent need in this Volume I to show that
we in the Nordic countries also think of this side of the
matter, and among the authors of the six essays of which
this group consists (besides the above-mentioned contribution
by Linders) we find all the four Nordic countries represented.

Although the chief importance of treatises of this kind
would seem to fall within the realms of theory, still one of
these essays, namely Nybølle's "Interpolation in statistics", is
of rather eminent practical importance and use. Besides, or
rather because of, his clear and sharp differentiation between
the purely mathematical and the statistical meaning of
the word interpolation, and the very close connection of the
last-mentioned notion with that of adjustment, he here gives
exclusively practical advice and instructions useful in
circumstances which, so to say, belong to the everyday life
of the statistician. This treatise will be very welcome to
many colleagues. In some contrast to it stands Ragnar
Frisch's "Correlation and scatter in statistical variables", in
size as well as in importance one of the biggest treatises of
the volume. It will prick the conscience of many, as it
will make them clearly feel the obligation to penetrate more
deeply into its contents (the author himself tells us
that he has come "to various results, some of which are
known, and others which are new, so far as I am aware"),
but they will easily feel frightened by the author's imposing
mathematical apparatus, which is nothing less than n-dimensional
vectors and the appertaining matrices and orthogonal
transformations. But one ought not to be frightened by these
heavy implements, and this so much the less as the author
presupposes not even elementary knowledge in the field of vector
calculation. On the contrary, he explains his whole apparatus
thoroughly. One need never have heard words such as vector or
orthogonal before, and will still be able to study this
work, which on this account has naturally become somewhat
extensive (67 pages). It may be greeted with great satisfaction
that one thus begins to attack the problems of
theoretical statistics with such weapons. Terms such as
scatter and correlation are so fundamental to science that no
apparatus we bring into use can be too precious for attacking
the problems contained in these terms. Here the best is
not too good.

Further we meet Jerneman's "To the method of sampling"
and Lindeberg's "Some remarks on the mean error of the
percentage of correlation", essays well suited to awaken respect
for Nordic science abroad.

The contact with the kindred science, social economics, is
attended to in the volume by Gjermoe's "The amplitude of
industrial fluctuations", an essay covering 63 pages, which
struggles with the difficult and not yet very well defined
notions of times of ascent and descent, crises, fluctuations, and
so on. The notion of "trend", so prominent in all the
ultra-modern investigations, belongs to the apparatus used here,
and in a quotation (this essay is very rich in quotations,
which is praiseworthy) we learn that it was
already found by Hooker in 1901 and was defined by him as
"the direction in which the variable is really moving when
the oscillations are disregarded".

As belonging rather specially to the "future" we finally meet
Ångström's brilliant essay on meteorology and statistics,
and have a presentiment that the latter is destined one day to
play the principal role before the former.

We will not end this review without another congratulation
to Thor Andersson for having brought forth this Volume I,
followed by the wish and hope that his idealistic struggle for
the highest aims will meet with the wished-for success before he
grows too tired to fight against adversity. It is sadly known
that the Swedish Riksdag has not granted him the necessary
subsidy, in spite of the fact that such recommendations as the
following one could be appended to the petition:

"By the way in which Dr. Thor Andersson has edited Nordisk
Statistisk Tidskrift, he has, in our opinion,
rendered great services towards the advancement of scientific
statistics and towards the spread of the knowledge of
its extraordinary importance, not only for other sciences,
but also, and not least, for the obtaining of a real knowledge
of the social and economic structure of society, a
knowledge that is necessary if the public measures for correcting
social evils and furthering industry are to have the
wished-for effect. His name guarantees that the journal
will, also in the future, hold the same prominent position as
it now holds among publications of this kind."

Stockholm, January 17th, 1929.

E. Phragmén    P. G. Laurin



When not even this was of any use, one does not know what
to do. Judging by former experience, it would be of still
less use to refer to the fact that this journal is an honour to
its country (and to the Nordic states in general), as there
is nothing like it in any of the foreign countries. For from such
an argument, presented by a supporter of Eilert Sundt in the
Norwegian Storting, the short-sighted politician Jaabaek drew
the conclusion that it was surely so because such things
were superfluous! The grant to Sundt was denied and he
withdrew to a clergyman's living. Will the present politicians
really dishonour themselves still more in connection with this
enterprise? Let us hope that the means will be found for
carrying on this great work.


Sthlm, Aftonbladets tr., 1930




